33rd Research Students’ Conference in Probability and Statistics
12th-15th April 2010
Conference Proceedings
Timetable of Events
Monday 12th April
13:00 Registration of Delegates (The Street)
15:00 Afternoon Tea (The Street)
15:30 Opening Address & Plenary Session (MS.01, Maths/Stats Building)
    15:30 Opening Address: Prof. Jane L. Hutton (University of Warwick)
    15:45 Plenary Talk I: Prof. Jim Q. Smith (University of Warwick)
    16:20 Plenary Talk II: Dr. Jonathan Rougier (University of Bristol)
    16:55 Announcements/Housekeeping
18:00 Dinner (Rootes Social Building)
19:00 Pub Quiz (Varsity Pub)
Tuesday 13th April
07:30 Breakfast (Rootes Social Building)
09:10 Session 1 (Math/Stats Building)
11:10 Refreshments (The Street)
11:30 Session 2 (Math/Stats Building)
13:30 Lunch (The Street)
14:30 Session 3 (Math/Stats Building)
16:10 Poster Session and Refreshments (The Street)
18:00 Dinner (Rootes Social Building)
19:00 Evening Entertainment (Coventry City Centre)
    19:00 Bus Collection by Students Union to Cross Point Business Park (Bowling and Cinema)
    19:30 Bus Collection by Students Union to Town Hall (Pub Crawl)
    22:00 Bus Collection from Bowling and Cinema to Campus
    23:30 First Bus Collection from Pub to Campus
    00:30 Second Bus Collection from Pub to Campus
Wednesday 14th April
07:30 Breakfast (Rootes Social Building)
09:10 Session 4 (Math/Stats Building)
11:10 Refreshments (The Street)
11:30 Session 5 (Math/Stats Building)
13:30 Lunch (The Street)
14:30 Sponsors’ Talks (Math/Stats Building)
16:10 Sponsors’ Wine Reception (The Street)
18:15 Bus Collection to Conference Dinner (Coventry Transport Museum)
22:15 First Bus Collection to Campus
23:45 Second Bus Collection to Campus
Thursday 15th April
07:30 Breakfast (Rootes Social Building)
09:30 Delegates Depart
Contents
1 Welcome from the Organisers
2 The City and University
3 Campus Map
4 University Facilities
5 Accommodation
6 Conference Details
6.1 Meals
6.2 Sponsors’ Wine Reception
7 Help, Information and Telephone Numbers
7.1 Departmental Computing and Internet Access
8 Instructions
8.1 For Chairs
8.2 For Speakers
8.3 For Displaying a Poster
8.4 Prizes
9 Plenary Session
9.1 Professor Jane L. Hutton (University of Warwick)
9.2 Professor Jim Q. Smith (University of Warwick)
9.3 Dr. Jonathan Rougier (University of Bristol)
10 List of Sponsors’ Talks
11 Talks Schedule
11.1 Monday 12th April
11.2 Tuesday 13th April
11.3 Wednesday 14th April
12 Talk Abstracts by Session
12.1 Tuesday 13th April
12.1.1 Session 1a: Image Analysis
12.1.2 Session 1b: Computational Statistics
12.1.3 Session 1c: Operational Research
12.1.4 Session 1d: Statistical Inference
12.1.5 Session 2a: Medical Statistics I
12.1.6 Session 2b: Financial
12.1.7 Session 2c: Elicitation and Epidemiology
12.1.8 Session 2d: Multivariate Statistics
12.1.9 Session 3a: Genetics
12.1.10 Session 3b: Medical Statistics II
12.1.11 Session 3c: Dimension Reduction
12.1.12 Session 3d: Environmental
12.2 Wednesday 14th April
12.2.1 Session 4a: Medical Statistics III
12.2.2 Session 4b: Point Processes and Spatio-temporal Statistics
12.2.3 Session 4c: General
12.2.4 Session 4d: Graphical Models and Extreme Value Theory
12.2.5 Session 5a: Experimental Design and Population Genetics
12.2.6 Session 5b: Censoring in Survival Data and Non-Parametric Statistics
12.2.7 Session 5c: Time Series and Diffusions
12.2.8 Session 5d: Probability
12.2.9 Session 6a: Sponsors’ Talks
12.2.10 Session 6b: Sponsors’ Talks
12.2.11 Session 6c: Sponsors’ Talks
13 Poster Abstracts by Author
14 RSC 2011: Cambridge University
15 Sponsors’ Advertisements
16 RSC History
17 Delegate List
1 Welcome from the Organisers
Welcome to the 33rd Research Students’ Conference in Statistics and Probability (RSC 2010). This year the conference is hosted by the University of Warwick. The RSC is an annual event aiming to provide postgraduate statisticians and probabilists with an appropriate forum to present their research. This four-day event is organised by postgraduates, for postgraduates, providing an excellent opportunity to make contacts and discuss work with other students who have similar interests.
For many students this will be your first experience of presenting your work, with some of you also taking the opportunity to chair a session. For those of you attending and not presenting, we hope that you will benefit greatly from observing others and networking with researchers working in a similar field.
Finally, we will be looking for potential hosts for RSC 2012. If you think your institution would be keen to take part in such an exciting project, please let us know. Next year the conference will be held in Cambridge.
Mouna Akacha, Flavio Goncalves, Bryony Hill and Jennifer Rogers
Conference Organisers
2 The City and University
The University of Warwick is one of the leading UK research universities and is ranked number 1 in the Midlands. Consistently ranked in the top ten of UK universities, it is an entrepreneurial institution that has a large positive impact on local and regional communities. The University is located in the heart of England, 3 miles (5 kilometres) from Coventry city centre, on the border with Warwickshire.
Coventry
Coventry is a city and metropolitan borough in the county of West Midlands in England. Coventry is the 9th largest city in England and the 11th largest in the United Kingdom. It is also the second largest city in the English Midlands, after Birmingham.
Coventry is situated 95 miles (153 km) northwest of London and 19 miles (30 km) east of Birmingham, and is farthest from the coast of any city in Britain. Although it has a population of almost a third of a million, Coventry is not amongst the English Core Cities Group, due to its proximity to Birmingham.
Coventry was also the world’s first ‘twin’ city when it formed a twinning relationship with the Russian city of Stalingrad (now Volgograd) during World War II. The relationship developed through ordinary people in Coventry who wanted to show their support for the Soviet Red Army during the Battle of Stalingrad. The city is now twinned with Dresden and with 27 other cities around the world.
Coventry Cathedral is one of the newer cathedrals in the world, having been built following the World War II bombing of the ancient cathedral by the Luftwaffe. Coventry motor companies have contributed significantly to the British motor industry, and the city has two universities, the city centre-based Coventry University as well as the University of Warwick on the southern outskirts.
In the late 19th century, Coventry became a major centre of bicycle manufacture, with the industry being pioneered by Rover. By the early 20th century, bicycle manufacture had evolved into motor manufacture, and Coventry became a major centre of the British motor industry. Over 100 different companies have produced motor vehicles in Coventry, but car production came to an end in 2006 as the last car rolled off the lines at Peugeot’s Ryton plant. Production was transferred to a new plant near Trnava, Slovakia, with the help of EU grant aid to Peugeot: this made Peugeot deeply unpopular in the city. The design headquarters of Jaguar Cars is still in the city at their Whitley plant and although they ceased vehicle assembly at their Browns Lane plant in 2004, they still continue some operations from there.
A major visitor attraction in Coventry city centre is the free-to-enter Coventry Transport Museum, which has the largest collection of British-made road vehicles in the world and will be the venue for our Conference Dinner. The most notable exhibits are the world speed record-breaking cars, Thrust2 and ThrustSSC. The museum received a major refurbishment in 2004 which included the creation of a striking new entrance as part of the city’s Phoenix Initiative project. The revamp saw the museum exceed its projected five-year visitor numbers within the first year alone, and it was a finalist for the 2005 Gulbenkian Prize.
The most famous daughter of Coventry is Lady Godiva. Her ride through the streets of the city has passed into legend. According to the popular story, Lady Godiva took pity on the people of Coventry, who were suffering grievously under her husband’s oppressive taxation. Lady Godiva appealed again and again to her husband, who obstinately refused to remit the tolls. At last, weary of her entreaties, he said he would grant her request if she would strip naked and ride through the streets of the town. Lady Godiva took him at his word and, after issuing a proclamation that all persons should stay indoors and shut their windows, she rode through the town, clothed only in her long hair. Today a statue in the heart of the city centre commemorates her bravery.
The University
The establishment of the University of Warwick was approved by the government in 1961, and the University received its Royal Charter of Incorporation in 1965. It straddles the boundary between the City of Coventry and the County of Warwickshire. The idea for a university in Coventry was mooted shortly after the conclusion of the Second World War, but it was a bold and imaginative partnership of the City and the County which brought the University into being on a 400 acre site jointly granted by the two authorities. Since then, the University has incorporated the former Coventry College of Education in 1978 and has extended its land holdings by the purchase of adjoining farm land.
The University initially admitted a small intake of graduate students in 1964 and took its first 450 undergraduates in October 1965. In October 2009, the student population was over 21,598, of whom around 9,008 were postgraduates. 25% of the student body comes from overseas and over 125 countries are represented on the campus. The University has 29 academic departments and over 50 research centres and institutes, in four Faculties: Arts, Medicine, Science and Social Sciences. The University hosts two HEFCE Centres for Excellence in Learning and Teaching (CETLs): CAPITAL and Reinvention. The new Medical School took its first students on an innovative 4-year accelerated postgraduate programme in September 2000. In summer 2004 the first 64 students graduated from the school. In October 2004 the combined intake of the Warwick Medical School was 403, making it one of the largest in the country.
From its beginnings, the University has sought to be excellent in both teaching and research. It has now secured its place as one of the UK’s leading research universities, confirmed by the results of the government’s Research Assessment Exercises since 1986. In all of these, Warwick has been placed in the top half dozen or so of universities for the quality of its research. The results of the 2008 Research Assessment Exercise (RAE) again confirm Warwick’s position as one of the UK’s leading research universities, with Warwick ranked 7th overall in the UK (based on multi-faculty institutions).
The University of Warwick campus was recently voted the best campus in the UK. It’s a lively, cosmopolitan place with its own shops, banks, bars and restaurants - an exciting place to live and work with everything you could need close at hand. There is a great sense of community at Warwick: the campus is home to students and staff from over 120 different countries and from all backgrounds, and is a great resource for the local community with excellent facilities such as Warwick Arts Centre and the University Sports Centre. The campus is continually developing; in August 2008 Warwick Digital Laboratory was opened by Prime Minister Gordon Brown, and the indoor tennis centre at Westwood campus was opened in March 2008. The campus is situated on three adjacent sites: Central campus, Gibbet Hill campus and Westwood campus. There are lakes and woods, trees and landscaped gardens but whilst the campus has many green open spaces, inside the buildings ground-breaking research is taking place and academics and students are sharing their knowledge and experience.
The Department
The Department of Statistics at the University of Warwick is one of the largest UK concentrations of researchers in statistics and probability, and the synergy between probabilistic and statistical research is particularly strong. The research environment is vibrant, with a large and active community of PhD students and postdoctoral researchers, excellent library, computing and other research support facilities, and sustained programmes of research seminars, workshops and international visitors. There are strong research links with other disciplines both at Warwick and externally.
Research-related activities (seminars, workshops, visitors, etc.) take place mainly through three long-term initiatives: CRiSM (Centre for Research in Statistical Methodology), P@W (Probability at Warwick) and RISCU (Risk Initiative and Statistical Consultancy Unit). CRiSM is funded by EPSRC and HEFCE, as well as Warwick, as a national Science and Innovation investment. P@W is a focus for inter-departmental probability research at Warwick and for the organisation of externally open research workshops and training events in probability, while RISCU provides resources for developing applied research collaborations with industry, commerce, government and other outside bodies, and with other academic disciplines.
The Department’s research ranges from probability theory, through computation and statistical methodology, to substantive applications in many different fields. In the most recent national Research Assessment Exercise (RAE 2008), the Department had 70% of its activity rated as internationally excellent (grade 3* or higher), with more than a quarter classed as world leading (grade 4*). For publications by members of the department, please see individual staff web pages.
The Department leads the EPSRC-funded Academy for PhD Training in Statistics, a collaboration with eight other prominent UK research groups to organise intensive courses for first-year PhD students. From 2010 a further new feature of our PhD provision is the EPSRC-funded MASDOC initiative for doctoral training at the interface between statistics and applied mathematics.
3 Campus Map
[Campus map: grid columns A-H, rows 1-9, with buildings numbered 1-71; road label: Health Centre Road]
SYMBOLS
University Buildings
Student Residences
Car Parks
Building Entrances
Wheelchair Accessible Entrances
Controlled Access
Footpaths
Footpaths/Cycleways
One way Road
Bus Stop
No Entry
For the most up-to-date version of this map go to warwick.ac.uk/go/maps
For further information see the University web site or mobile site www.m.warwick.ac.uk
BUILDING KEY
Building .... No. .... Grid ref.
International Automotive Research Centre (IARC) .... 1 .... E4
Arden .... 2 .... F2
Argent Court, incorporating Estates, AdsFab & Jobs.ac.uk .... 3 .... G3
Arthur Vick .... 4 .... F6
Avon Building, incorporating Drama Studio .... 5 .... G2
Benefactors .... 6 .... C5
Biological Sciences .... 7 .... D8
Biomedical Research .... 8 .... D8
Gibbet Hill Farmhouse .... 9 .... C8
Chaplaincy .... 10 .... D5
Chemistry .... 11 .... D4
Claycroft .... 12 .... G5
Computer Science .... 13 .... E4
Coventry House .... 14 .... D5
Cryfield, Redfern & Hurst .... 15 .... B5
Dining & Social Building Westwood .... 16 .... G2
Education, Institute of, incorporating Multimedia CeNTRE & TDA Skills Test Centre .... 17 .... H2
Engineering .... 18 .... E4
Engineering Management Building .... 19 .... F2
Games Hall .... 20 .... E2
Gatehouse .... 21 .... D3
Health Centre .... 22 .... D6
Heronbank .... 23 .... A4
Humanities Building .... 24 .... E4
International House .... 25 .... C6
International Manufacturing Centre .... 26 .... E4
IT Services Elab level 4 .... 27 .... H2
IT Services levels 1-3 .... 28 .... H2
Jack Martin .... 29 .... E6
Lakeside .... 30 .... B3
Lakeside Apartments .... 31 .... B2
Library .... 32 .... D4
Lifelong Learning .... 33 .... G2
Medical School Building .... 34 .... D8
Mathematics & Statistics (Zeeman Building) .... 35 .... F4
Maths Houses .... 36 .... E8
Medical Teaching Centre .... 37 .... D8
Millburn House .... 38 .... F3
Modern Records Centre & BP Archive .... 39 .... D5
Music .... 40 .... H2
Nursery .... 41 .... C3
Physical Sciences .... 42 .... D4
Physics .... 43 .... D4
Porters & Postroom .... 44 .... G1
Psychology .... 45 .... E5
Radcliffe .... 46 .... C4
Ramphal Building .... 47 .... D4
Rootes .... 48 .... C6/D6
Rootes Building .... 49 .... C5
Scarman .... 50 .... C3
Science Education .... 51 .... H2
Shops .... 52 .... D5
Social Sciences .... 53 .... D4
Sports Centre .... 54 .... E5
Sports Pavilion .... 55 .... A5
Students’ Union .... 56 .... D5
Tennis Centre .... 57 .... F2
Tocil .... 58 .... F5
University House, incorporating Learning Grid .... 59 .... E2
Vanguard Centre .... 60 .... G3
Warwick Arts Centre, incorporating Music Centre .... 61 .... D5
Warwick Business School (WBS) .... 62 .... D4
    WBS Main Reception, Scarman Rd .... 62 .... D3
    WBS Social Sciences .... 63 .... D5
    WBS Teaching Centre .... 64 .... C4
Warwick Digital Laboratory .... 65 .... F4
WarwickPrint .... 66 .... H2
Westwood .... 67 .... G1/G2
Westwood Gatehouse OCNCE .... 68 .... H2
Westwood House, incorporating Occupational Health, Counselling & DARO Calling Room .... 69 .... G2
Westwood Teaching and Westwood Lecture Theatre .... 70 .... H2
Whitefields .... 71 .... D5
A full-size version of the map is provided in the Conference pack.
4 University Facilities
Everything you will need during your stay can be found on the University campus. Situated on 700 acres of rural parkland, the campus ‘village’ environment has its own banks, bars, shops and outlets.
All meals - breakfasts, lunches, dinners and morning/afternoon refreshments - are included in the conference registration. However, if you find yourself still hungry there are a number of bars and cafes open around campus and also a small Costcutter supermarket located next to the Student Union. Inside Costcutter there is also a Post Office and Copyshop (for printing, photocopying and binding).
A 10-minute walk takes you to the local Tesco, Boots and Iceland at Cannon Park Shopping Centre. Coventry’s high street stores are a bus-ride away, as is Leamington Spa’s range of boutique and high street shops.
The Student Union building (possibly the largest in Europe) has recently been rebuilt in an £11 million redevelopment project. As well as a new entertainments venue, there are also more spaces for those who just want to go out and have a drink, including the new pub ‘The Dirty Duck’, which serves its own local ale, and ‘The Terrace Bar’, which looks out over the Piazza. Downstairs in the Union are branches of two major UK banks - Barclays and NatWest - and also a pharmacy and hair salon, should you need them!
If you are coming by rail or bus (e.g., National Express or Megabus), you should come to Coventry. Travel Coventry service number 12 buses (which display the destination University of Warwick or Leamington Spa) run from the city centre bus station (Pool Meadow), via Coventry Rail Station, to the University Central Campus, passing the Westwood campus en route.
Free car parking is available for all delegates staying on campus. You can request an access code for car parks 7, 8 and 15 (see campus map) from Rootes Social Building reception when you check in.
5 Accommodation
Accommodation is in en-suite rooms on campus, a 5-minute walk from both the Math/Stats Department and Rootes Social Building, where breakfast and dinners will be served.
All rooms have towels and toiletries. Kitchen facilities are available although all meals are provided.
Internet is available in all bedrooms. Details of how to log onto the system will be displayed in each individual bedroom, but delegates will need to bring their own Ethernet cable. These can be purchased from Rootes Reception should anyone not be in possession of one.
Rooms will be available for check-in after 15:00; however, luggage can be left at Rootes Reception in Rootes Social Building until this time. All bedrooms must be vacated by 9:30am on Thursday 15th.
6 Conference Details
On Monday 12th, delegates should arrive at the Math/Stats Building (Zeeman Building) between 13:00 and 15:00 to register and collect conference packs. These contain all the information needed during the conference. If you are presenting a poster, please submit it at registration. The conference will open with the plenary session at 15:30 in the Math/Stats Department.
On Tuesday 13th and Wednesday 14th, delegates will have the opportunity to present talks. Posters will be on display in The Street of the Math/Stats Building throughout the afternoon of Tuesday 13th, with the poster session commencing at 16:10. Presenters are encouraged to be near their posters during this session in order to answer questions from interested participants.
6.1 Meals
Breakfasts and evening meals (except on the evening of the conference dinner) will be served in Rootes Restaurant on campus. Lunches and morning/afternoon refreshments will be served in the Math/Stats Department, where the conference will be held.
Please note that on the first day of the conference (Mon 12th) we will not be providing any lunch. However, there are plenty of eating facilities available on campus, and tea, coffee and cakes will be served before the plenary session.
Dinner on the Wednesday evening will be at Coventry Transport Museum. You will be expected to wear formal attire (no jeans or trainers please). Before the meal you will be given an opportunity to have a look around the museum, and afterwards there will be a Ceilidh, followed by a DJ.
Coaches to the conference dinner will pick delegates up by the Students Union at 18:15.
6.2 Sponsors’ Wine Reception
The Sponsors’ Reception will be held in The Street in the Maths/Stats Building on the Wednesday at 16:10, prior to the conference dinner. Please take this opportunity to talk with our sponsors and visit their displays to learn more about possible career opportunities.
7 Help, Information and Telephone Numbers
Department address:
Dept of Statistics
University of Warwick
Coventry
CV4 7AL
Telephone: 024 7657 4812
Fax: 024 7652 4532

Emergency Numbers:
University Security: 024 7652 2083 (also for general emergencies)
Conference Organiser: 077 2998 4952 (Jennifer Rogers, resident on campus)

Transport:
Swift Taxis Coventry: 024 7676 7676
Trinity Street Taxis: 024 7663 1631
Bus information: 0871 200 2233
National Rail Enquiries: 08457 484950
7.1 Departmental Computing and Internet Access
Free wireless internet access will be available to all delegates in The Street area of the Maths/Stats building. After the Plenary Session, you will be given the username and password needed to access this service from your laptop.
8 Instructions
8.1 For Chairs
• Please arrive at the appropriate seminar room five minutes before the start of your session. Familiarise yourself with the visual equipment.
• Packs will be left in each seminar room. Do not remove the packs or any of their contents from the seminar room. If you think something might be missing from the pack, please contact one of the organisers.
• You should clearly introduce yourself and each speaker in turn.
• It is very important that we stick to the schedule. Therefore please start the session on time, use the time remaining cards, and make sure that questions are not allowed to delay the rest of the session.
• If a speaker fails to show, please advise the audience to attend a talk in an alternative seminar room. Do not move the next talk forward.
• After each talk, thank the speaker, encourage applause, and open the floor to questions (from students only). If no questions are forthcoming, ask one yourself.
• Use the 5 min and 1 min flash cards to assist the speaker in finishing on time.
8.2 For Speakers
• Each seminar room will contain a computer, data projector and white/blackboard.
• Arrive five minutes before the start of the session, introduce yourself to the chair and load your presentation onto the computer.
• Presentations must be PDF or PowerPoint (ppt or pptx) files. No other format is acceptable.
• Talks are strictly fifteen minutes plus five minutes for questions. Anyone going over this time will be asked to stop by the chair.
• Your chair will let you know when you have five minutes and then one minute remaining for your presentation.
8.3 For Displaying a Poster
• The poster session will be held in The Street area of the Math/Stats Building at 16:10 on Tuesday 13th April.
• Please submit posters upon registration on Monday 12th April.
• Posters will be erected by conference organisers.
• During the poster session, it is advisable to be near your poster in order to answer questions from interested participants.
• Posters will also be displayed throughout Tuesday afternoon.
• Please ensure that your poster is removed by 17:30 on Tuesday.
• Posters should be no larger than A1.
8.4 Prizes
The three best talks and the best poster, as voted for by all delegates, will receive prizes in the form of book vouchers from our sponsors CUP and Wiley-Blackwell and additionally, courtesy of the Royal Statistical Society:
The RSS will offer the best three presentations and the best poster from the RSC2010 conference the opportunity to present their work at the RSS2010 conference which will be held from 13-17 September in Brighton. The three best presentations will participate in a special session at the conference and the poster will be presented alongside the other posters at the event. The prize will be in the form of free registration at the conference for the four winners. (The registration fee includes many meals and social events but not transport or accommodation.)
Further details about the conference can be found at: www.rss.org.uk/rss2010
9 Plenary Session
9.1 Professor Jane L. Hutton (University of Warwick)
Opening Address
Jane L. Hutton is a Professor of Statistics in the Department of Statistics, University of Warwick. She works in medical statistics, with special interests in survival analysis, meta-analysis and non-random data. Accelerated failure time models are a particular focus in her research in survival analysis. She has major collaborations in cerebral palsy and epilepsy. Her work with Professor Peter Pharoah and Dr Allan Colver, on life expectancy in cerebral palsy, has had a substantial effect on the size of awards in medico-legal cases. This work is widely cited nationally and internationally. In epilepsy, she has contributed to many Cochrane reviews of anti-epileptic drugs. She is currently working on a research project with Dr Tony Marson, of Liverpool University Neurosciences Department. She has written extensively on ethics and philosophy of statistics. She has contributed to Research Council ethics guidelines.
9.2 Professor Jim Q. Smith (University of Warwick)
Title: How to do Research Creatively
Abstract
Making the shift from being a taught student to a researcher is a challenging one. We all develop the skill to deliver to our teachers what they want to see in exams. Now suddenly we must develop a completely distinct set of skills where the point of our work is to produce something *different* from what other researchers do. How can this transition to becoming a creative researcher in Statistics or Probability be managed? In this short talk I will outline some techniques I have developed over the years: some of which I hope you might find useful.
Jim Q. Smith is a Professor of Statistics at Warwick University and has researched a wide range of topics both theoretical and applied, but always Bayesian. He is currently Chair of RISCU, the consultancy arm of the statistics department and has close research ties with various companies and government departments.
9.3 Dr. Jonathan Rougier (University of Bristol)
Title: Complex systems: Accounting for model limitations
Abstract
Many complex systems, notably environmental systems like climate, are highly structured, and numerical models, known as simulators, play an important role in prediction and control. It is crucial to account for limitations in simulators, since these can be substantial, and can vary substantially from one simulator to another. These limitations can be categorised in terms of input uncertainty, parametric uncertainty, and structural uncertainty. The talk explains this framework, and the particular challenge of accounting for simulator limitations in dynamical systems, with illustrations from climate science and natural hazards.
Jonty Rougier is an applied statistician working in the area of computer experiments, particularly for complex environmental systems like climate. He studied Economics and then Statistics at Durham, the latter as a postdoc working with Michael Goldstein and Allan Seheult. He is currently a Lecturer in Statistics in the Department of Mathematics at the University of Bristol.
10 List of Sponsors’ Talks
On Wednesday 14th several of the conference sponsors will be giving presentations as part of the main conference programme, providing an opportunity to learn about their statistical work.
Session 6a, Room MS.01, Chair: Jennifer Rogers
Time  Sponsor  Speaker  Title  Pg
14:30  International Biometric Society  Richard Emsley  The International Biometric Society: What can it offer to Postgraduate Students?  91
15:05  Pfizer  Phil Woodward  Bayesian Design & Analysis of Experiments  91
15:40  SmartOdds  Robert Mastrodomenico  An Introduction to Football Modelling at Smartodds  91
Session 6b, Room MS.04, Chair: Mouna Akacha
Time  Sponsor  Speaker  Title  Pg
14:30  Shell  Wayne Jones  Making Decisions with Confidence - Statistics the Shell Way  92
15:05  AHL, Man Group PLC  Martin Layton  An Introduction to AHL  92
Session 6c, Room MS.05, Chair: Flavio B Goncalves
Time  Sponsor  Speaker  Title  Pg
15:05  Royal Statistical Society  Helen Thornewell  Support from the RSS and their Young Statisticians Section  93
15:40  Lloyds Banking Group  Bill Fite  Opportunities in Probability and Statistical Modelling at Lloyds Banking Group Decision Science  93
11 Talks Schedule
11.1 Monday 12th April
Session – Plenary
Chair: Jennifer Rogers
Room: MS.01, Maths/Stats Building
Time  Speaker  Title  Pg
15:30  Hutton, Jane L.  Opening Address  17
15:45  Smith, Jim Q.  How to do Research Creatively  18
16:20  Rougier, Jonathan  Complex systems: Accounting for model limitations  19
11.2 Tuesday 13th April
Session 1a: Image Analysis
Chair: Bryony Hill
Room: MS.01
Time  Speaker  Title  Pg
09:10  Doshi, Susan  Statistical image reconstruction for cone-beam computed tomography  32
09:35  Fallaize, Christopher  Matching Shapes of Different Sizes  33
10:00  Khatun, Mahmuda  Morphological Granulometry for Image Texture Analysis and Classification  34
10:25  Yan, Lei  Statistical Threshold of Magnetoencephalographic (MEG) Data  34
10:50  Llewelyn, Stephanie  Statistical Modelling of Fingerprints  35
Session 1b: Computational Statistics
Chair: Flavio B Goncalves
Room: MS.04
Time  Speaker  Title  Pg
09:10  Cainey, Joe  Performance of Pseudo-Marginal MCMC Algorithms  36
09:35  O’Hagan, Adrian  Computational Advances in Fitting Mixture Models via the EM Algorithm  36
10:00  Prangle, Dennis  Summary statistics for Approximate Bayesian Computation  37
10:25  Raychaudhuri, Clare  Investigating methods to approximate the expectation efficiently  38
10:50  Vrousai, Dina  Sampling from the posterior- MCMC, Importance resampling or Numerical integration?  39
Session 1c: Operational Research
Chair: Fiona Sammut
Room: MS.05
Time  Speaker  Title  Pg
09:10  Anacleto-Junior, Osvaldo  Bayesian forecasting models for traffic management systems  39
09:35  Aslett, Louis JM  Modelling and Inference for Networks with Repairable Redundant Subsystems  40
10:00  May, Benedict  Multi-Armed Bandit with Regressor Problems  40
10:25  Moffatt, Joanne  Analysing strategy in the sprint race in track cycling using logistic regression  41
10:50  Hashim, Siti R.M.  Interpretation Problems in Multivariate Control Chart  42
Session 1d: Statistical Inference
Chair: Stephen Burgess
Room: A1.01
Time  Speaker  Title  Pg
09:10  Jamalzadeh, Amin  Developing Effect Sizes for Non-Normal Data  42
09:35  Jesus, Joao  Inference without likelihood  43
10:00  McElduff, Fiona  Maximum likelihood estimation of discrete distribution parameters using R  43
10:25  Ogundimu, Emmanuel  Investigating the impact of missing data on Cronbach’s alpha estimates and Confidence Intervals  44
10:50  Zwiernik, Piotr  Posets, Mobius functions and tree-cumulants  44
Session 2a: Medical Statistics I
Chair: Mouna Akacha
Room: MS.01
Time  Speaker  Title  Pg
11:30  Ewings, Sean  Modelling Blood Glucose Concentration for People with Type 1 Diabetes  45
11:55  Smith, Joanna  Methods for the Analysis of Asymmetry  46
12:20  Strawbridge, Alexander  Measurement error correction of the association between fasting blood glucose and coronary heart disease - a structural fractional polynomial approach  46
12:45  Verykouki, Eleni  Modelling the effects of antibiotics on carriage levels of MRSA  47
13:10  Roloff, Verena  Planning future studies based on the conditional power of a random-effects meta-analysis  48
Session 2b: Financial
Chair: Murray Pollock
Room: MS.04
Time  Speaker  Title  Pg
11:30  Lapinski, Tomasz  Modelling the rank system with Gibbs, Bose Einstein or Zipf Law. Application in Mathematical Finance  48
11:55  Michelbrink, Daniel  A Martingale Approach to Active Portfolio Selection  49
12:20  Pham, Duy  Measuring vega risks of Bermudan swaptions under the Markov-Functional model  49
12:45  Shahtahmassebi, Golnaz  Mathematical and Statistical Models for Predicting Financial Behaviour  50
13:10  Wang, Chun  An optimal stopping problem of finite horizon with regime switching  50
Session 2c: Elicitation and Epidemiology
Chair: Michelle Stanton
Room: MS.05
Time  Speaker  Title  Pg
11:30  Elfadaly, Fadlalla G.  On Eliciting Expert Opinion in Generalized Linear Models  51
11:55  Noosha, Mitra  Discordancy between the prior and data using conjugate priors  51
12:20  Ford, Ashley P.  Indian Buffet Epidemics. A Bayesian Approach to Modelling Heterogeneity  52
12:45  Worby, Colin  A hidden Markov model to analyse MRSA transmission in hospital wards  53
13:10  Walker, Neil  Estimating the size of a badger population using live capture and post-mortem data  53
Session 2d: Multivariate Statistics
Chair: Nathan Huntley
Room: A1.01
Time  Speaker  Title  Pg
11:30  Fayomi, Aisha  Cauchy Principal Components Analysis  54
11:55  Sweeney, James  Approximate Joint Statistical Inference for Large Spatial Datasets  54
12:20  Tsagris, Michael  Multivariate outliers, the forward search and the Cronbach’s Reliability Coefficient  55
12:45  Mohammad, Rofizah  Bayesian Analysis in Multivariate Data  55
13:10  Sammut, Fiona  Some Aspects of Compositional Data  56
Session 3a: Genetics
Chair: Dennis Prangle
Room: MS.01
Time  Speaker  Title  Pg
14:30  Evangelou, Marina  Incorporating available biological knowledge to explore genome-wide association data  56
14:55  Fowler, Anna  Informed Bayesian Clustering of Gene Expression Levels  57
15:20  Burgess, Stephen  An application of Bayesian techniques for Mendelian randomization to assess causality in a large meta-analysis  58
15:45  Cairns, Jonathan  BayesPeak: A Hidden Markov Model for analysing ChIP-seq experiments  59
Session 3b: Medical Statistics II
Chair: Helen Thornewell
Room: MS.04
Time  Speaker  Title  Pg
14:30  Hee, Siew Wan  Designing a Series of Phase II Trials  59
14:55  Magirr, Dominic  Response-Adaptive Block Randomization in Binary Endpoint Clinical Trials  60
15:20  Ren, Shijie  Bayesian clinical trial designs for survival outcomes  60
15:45  Yeung, Wai Yin  The power of the biased coin design for clinical trials  61
Session 3c: Dimension Reduction
Chair: James Sweeney
Room: MS.05
Time  Speaker  Title  Pg
14:30  Chand, Sohail  Oracle properties of Lasso-type methods in Regression problems  62
14:55  Khan, Md. Hasinur Rahaman  Penalized Weighted Least Squares Variable Selection Method for AFT Models with High Dimensional Covariates  62
15:20  Serradilla, Javier  Latent Variable Models for Process Monitoring  63
15:45  Yusoff, Nur Fatihah Mat  A study of item selection using principal component analysis and correspondence analysis  63
Session 3d: Environmental
Chair: Andrew Smith
Room: A1.01
Time  Speaker  Title  Pg
14:30  Jones, Emma M.  Using a Bayesian Hierarchical Model for Tree-Ring Dating  65
14:55  Norris, Beth  Not another species richness estimator?!  65
15:20  Oxlade, Rachel  Uncertainty analysis for multiple ecosystem models using Bayesian emulators  66
15:45  Powell, Helen  Estimating biologically plausible relationships between air pollution and health  67
11.3 Wednesday 14th April
Session 4a: Medical Statistics III
Chair: Fiona McElduff
Room: MS.01
Time  Speaker  Title  Pg
09:10  Iglesias, Alberto Alvarez  An application of survival trees to the study of cardiovascular disease  68
09:35  Dooley, Cara  Analysis of an Observational Study to in Colorectal Cancer Patients  68
10:00  O’Keeffe, Aidan  Causal Inference in Longitudinal Data Analysis: A Case Study in the Epidemiology of Psoriatic Arthritis  69
10:25  Thomas, Maria Roopa  Design and analysis of dose escalation trials  70
10:50  Nicholls, Stuart  Modelling parental decisions for newborn bloodspot screening  70
Session 4b: Point Processes and Spatio-Temporal Statistics
Chair: Chris Fallaize
Room: MS.04
Time  Speaker  Title  Pg
09:10  Marek, Patrice  Poisson Process Parameter Estimation from Data in Bounded Domain  71
09:35  Bakar, Khandoker Shuvo  A Comparison of Bayesian Space-Time Models for Ozone Concentration Levels  72
10:00  Proctor, Iain  Multi-level models for ecological response applications  72
10:25  Stanton, Michelle  A Spatio-temporal modelling of Meningitis Incidence in sub-Saharan Africa  73
10:50  Smith, Andrew  Denoising UK House Prices  73
Session 4c: General
Chair: Michael Tsagris
Room: MS.05
Time  Speaker  Title  Pg
09:10  Gollini, Isabella  Mixture of Latent Trait Analyzers  74
09:35  Klapper, Jennifer  A wavelet based approach to HPLC data analysis  74
10:00  Bhattacharya, Sakyajit  Delete-Replace Identity For A Set Of Independent Observations  75
10:25  Sanderson, Ria  Modelling Main Contractor Status for the New Orders Survey  75
10:50  Wilson, Kevin  Bayes linear kinematics in the analysis of failure rates  76
Session 4d: Graphical Models and Extreme Value Theory
Chair: Guy Freeman
Room: A1.01
Time  Speaker  Title  Pg
09:10  Wadsworth, Jenny  Uncertainty in Choice of Measurement Scale for Extreme Value Analysis  77
09:35  Youngman, Ben  Modelling extremal phenomena using different data sources  77
10:00  Byrne, Simon  Parametrisation of graphical models  78
10:25  Caimo, Alberto  Bayesian inference for Social Network Models  78
Session 5a: Experimental Design and Population Genetics
Chair: Andrew Simpkin
Room: MS.01
Time  Speaker  Title  Pg
11:30  Khadim, Mudakkar M.  Canonical Analysis of Multi-Stratum Response Surface Designs & Standard Errors of Eigenvalues  79
11:55  Martin, Kieran  D-optimal design of experiments for a dynamic model with correlated observations  79
12:20  Thornewell, Helen  Vulnerability: A 2nd Criterion to Distinguish between Equally-Optimal BIBDs  80
12:45  Kershaw, Emma  Surfing In One Dimension  80
13:10  Mair, Colette  Dimension Reduction for Human Genomic SNP Variation  81
Session 5b: Censoring in Survival Data and Non-Parametric Statistics
Chair: Jennifer Rogers
Room: MS.04
Time  Speaker  Title  Pg
11:30  Elsayed, Hisham Abdel Hamid  Parametric Survival Model with Time-dependent Covariates for Right Censored Data  82
11:55  Staplin, Natalie  Assessing the Effect of Informative Censoring in Piecewise Parametric Survival Models  83
12:20  Thom, Howard  Dealing with Censoring in Quality Adjusted Survival Analysis and Cost Effectiveness Analysis  83
12:45  Aboalkhair, Ahmad M  Nonparametric Predictive Inference for System Reliability  84
13:10  Toupal, Tomas  Nonparametric Estimation of Reliability of Two Random Variables Using Kernel Estimation of Density  85
Session 5c: Time Series and Diffusions
Chair: Alexander Strawbridge
Room: MS.05
Time  Speaker  Title  Pg
11:30  Bhattacharya, Arnab  Sequential Integrated Nested Laplace Approximation  85
11:55  Killick, Rebecca  Finding changepoints in a Gulf of Mexico hurricane hindcast dataset  86
12:20  Stevens, Kara  Prediction Intervals of the Local Spectrum Estimate  87
12:45  Suda, David  Discrete- and Continuous-time Approaches to Importance Sampling on Diffusions  87
13:10  Villalobos, Isadora Antoniano  Bayesian inference for diffusions based on exact simulation  88
Session 5d: Probability
Chair: Duy Pham
Room: A1.01
Time  Speaker  Title  Pg
11:30  Barranon, Antonio A. Ortiz  A New Bivariate Generalized Pareto Model  88
11:55  Huntley, Nathan  Backward Induction and Subtree Perfectness  89
12:20  Lee, Rui Xin  On the Convergence of Continuously Monitored Barrier Options Under Markov Processes  89
12:45  Wagnerova, Eva  Distortion of Probability Models  90
12 Talk Abstracts by Session
12.1 Tuesday 13th April
12.1.1 Session 1a: Image Analysis
Session Room: MS.01
Chair: Bryony Hill
Start time 09:10
STATISTICAL IMAGE RECONSTRUCTION FOR CONE-BEAM COMPUTED TOMOGRAPHY
Susan Doshi and Chris Jennison
University of Bath, UK
Keywords: Bayesian image analysis, Cone-beam CT, Image-guided radiotherapy
In image-guided radiotherapy, the accuracy of patient positioning is determined using images of internal anatomy in addition to the traditional external markers. This gives confidence that radiation prescribed for the treatment of cancer will be delivered to the desired volume. Treatment is usually delivered five days a week for several weeks, with imaging used on many of these occasions. X-ray cone-beam computed tomography (CBCT) is increasingly being used for this purpose. An X-ray source moves in a circular trajectory around the patient and planar projection images are acquired at increments of 1°. The data in these images are used to reconstruct a 3D representation of the patient.
Conventional Fourier-based reconstruction techniques rely on relatively noise-free projection images, with the entire patient diameter being included in each projection, and with a complete set of projections over more than 180°. Satisfying each of these requirements can be difficult. In addition, metallic fiducial markers may be implanted to help track the movement of soft tissues. These improve visualisation on the projection images, but may cause artefacts in the 3D reconstruction.
Statistical reconstruction techniques can cope naturally with these obstacles. In this presentation, we will introduce the Bayesian approach to image reconstruction. Modelling may be carried out in a number of spaces: the 2D projection image, the 3D patient space, or the 3D sinogram space (formed by ‘stacking’ the 2D projections along a third axis indexed by the projection angle). We can use a normal likelihood, or include aspects of the physical system in a more realistic model. Inference on the structure of the patient is based on MCMC sampling from the posterior distribution, and choices of prior and likelihood are made by considering the trade-off between accurate inference and the time taken to perform this sampling. The methods will be demonstrated using data acquired on clinical systems.
Start time 09:35
MATCHING SHAPES OF DIFFERENT SIZES
Christopher Fallaize
University of Leeds, UK
Keywords: Bayesian alignment, MCMC, Scale factor, Statistical shape analysis, Unlabelled landmarks
The shape of an object is the information invariant under the full similarity transformations of rotation, translation and rescaling. In statistical shape analysis, we are concerned with analysing differences in shape between individual objects or populations. To this end, we first seek some optimal registration which removes the effects of orientation, location and size, so that any remaining differences are due to genuine differences in shape.
Objects are often reduced to k points, known as landmarks, in m dimensions and thus can be represented as k × m point configurations. In labelled shape analysis the correspondence between landmarks on different configurations is known. Unlabelled shape analysis deals with the more complex situation where the correspondence between landmarks is unknown. Green and Mardia (Biometrika, 2006 pp. 235–254) developed a Bayesian methodology for the pairwise alignment of two unlabelled configurations using the rigid body transformations of rotation and translation.
We present the extension to full similarity shape by introducing a scaling factor to the model. Taking one of the configurations as a fixed reference, the aim is to estimate the transformation of the other configuration onto the reference whilst simultaneously identifying the matching between landmarks. Particular challenges include efficient simulation from a non-standard distribution for the scale factor and the desire for a symmetrical setup to ensure that equal inferences are drawn regardless of which configuration is taken as the reference. Possible applications include automated image analysis (where objects nearer or further away have different sizes) and biological morphometrics (where objects at different growth stages may be of different sizes). We shall illustrate our methodology with examples using both real and artificial data sets.
Start time 10:00
MORPHOLOGICAL GRANULOMETRY FOR IMAGE TEXTURE ANALYSIS AND CLASSIFICATION
Mahmuda Khatun1, Dr Alison Gray1 and Prof. Steve Marshall2
1 Department of Mathematics and Statistics, University of Strathclyde, Glasgow
2 Department of Electronic and Electrical Engineering, University of Strathclyde, Glasgow
Keywords: Image analysis, Morphology, Opening, Granulometry, Pattern spectrum, Structuring element
An important area of digital image analysis is the analysis of texture images. A statistical approach to image texture classification based on granulometric moments is described here. Mathematical morphology provides a set of non-linear techniques to extract shape-based information from an image, using image probes in the form of ‘structuring elements’. Opening granulometry is based on a sequence of morphological openings using scaled structuring elements. As the scale increases, more image areas are removed. Pattern spectra are formed by normalising the removed area by the total image area. Since the pattern spectrum is a probability density function its moments can be calculated. The pattern spectrum moments can be used as texture features for classification.
This work concerns sequences of texture images which evolve in time, and the classification of a new image to a point in time. Statistical models are being built to relate granulometric moments to evolution time directly, using training images for which both the evolution parameters and the time state are known. Each model can be used for back-prediction of evolution time of a new image from its observed granulometric moments. Better predictions are expected by combining different models.
Start time 10:25
STATISTICAL THRESHOLD OF MAGNETOENCEPHALOGRAPHIC (MEG) DATA
Lei Yan, C.J. Brignell and C. D. Litton
School of Mathematical Sciences, University of Nottingham, UK
Keywords: FWER, Random field, Permutation method
In this presentation, we show how magnetoencephalographic (MEG) data can be analyzed statistically using parametric (standard and random field) and nonparametric (permutation, bootstrap) methods. Compared to parametric statistical tests, nonparametric statistical tests provide complete freedom to the user with respect to the test statistic by means of which the experimental conditions are compared. We propose statistical thresholds that control the familywise error rate (FWER) across time or across both space and time. These approaches use the distribution of test statistics under the null hypothesis to find FWER thresholds. We show that the original permutation tests cannot control the FWER unless the experimental conditions have the same variance-covariance structure, which is difficult to achieve in practice. Unlike previous permutation-based tests in neuroimaging, we also address this problem with a permutation-based test that does not assume that different experimental conditions have the same variance-covariance structure.
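As a rough illustration of how a permutation distribution yields an FWER threshold across time, the following minimal Python sketch uses the standard max-statistic scheme with sign-flipping of paired differences. It is not the authors' method: it assumes exchangeability under the null (the very assumption the abstract seeks to relax), and the function name, data layout and parameters are hypothetical.

```python
import random
import statistics


def max_stat_threshold(diffs, n_perm=1000, alpha=0.05, seed=0):
    """FWER threshold across time from a max-statistic permutation test.

    diffs[s][t] is the paired difference between two conditions for
    subject s at time point t (hypothetical layout). Under the null the
    sign of each subject's difference curve is assumed exchangeable.
    """
    rng = random.Random(seed)
    n_time = len(diffs[0])
    max_stats = []
    for _ in range(n_perm):
        # Flip the sign of each subject's whole time course at random.
        signs = [rng.choice((-1, 1)) for _ in diffs]
        stats = [
            abs(statistics.fmean(s * row[t] for s, row in zip(signs, diffs)))
            for t in range(n_time)
        ]
        # Keep the largest statistic over all time points.
        max_stats.append(max(stats))
    max_stats.sort()
    # Empirical (1 - alpha) quantile of the maximum statistic.
    k = min(n_perm - 1, int((1 - alpha) * n_perm))
    return max_stats[k]
```

Any time point whose observed absolute mean difference exceeds the returned value would be declared significant, with the maximum over time points providing the multiplicity control.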
Start time 10:50
STATISTICAL MODELLING OF FINGERPRINTS
Stephanie Llewelyn
University of Sheffield, UK
Keywords: Fingerprints, Identification, Modelling
It is believed that fingerprints are determined in embryonic development. Unlike other personal characteristics the fingerprint appears to be a result of a random process. For example fingerprints of identical twins (whose DNA is identical) are distinct, and extensive studies have found little evidence of a genetic relationship in terms of types of fingerprint, certainly at the small scale. At a larger scale the pattern of ridges on fingerprints can be categorised as belonging to one of five basic forms: loops (left and right), whorls, arches and tented arches. The population frequencies of these types show little variation with ethnicity and a list of the types occurring on the ten digits can be used as an initial basis for identification of individuals. However, such a system would not uniquely identify an individual although the frequency of certain combinations could be extremely small. At a smaller scale various minutiae or singularities can be observed in a fingerprint. These include ridge endings and bifurcations, amongst others. Typical fingerprints have several hundred of these as well as two key points (with the exception of a simple arch) referred to as the core and delta, which are focal points of the overall pattern of ridges. Modern identification systems are based upon endings and bifurcations, not least because they are the easiest to determine automatically from image analysis. The configuration of these minutiae is unique to the individual.
The presentation will outline the history of use of fingerprints, illustrate some of these features used for identification and discuss ways in which statistical models could be developed to generate realistic fingerprints using data obtained from fingermarks.
12.1.2 Session 1b: Computational Statistics
Session Room: MS.04
Chair: Flavio B Goncalves
Start time 09:10
PERFORMANCE OF PSEUDO-MARGINAL MCMC ALGORITHMS
Joe Cainey
Statistics Group, University of Bristol
Keywords: MCMC, Latent Variable, Pseudo-Marginal, Metropolis-Hastings, GIMH,Autocorrelation
Given the problem of sampling from a distribution π(θ), the Metropolis-Hastings (MH) algorithm is often used to generate a Markov Chain with invariant distribution π(θ). In cases where π(θ) is intractable, or too complex to evaluate, a different approach must be taken. It is often possible to instead construct a Markov Chain with invariant distribution π(θ, z), where z can be missing data, or latent variables which make π(θ, z) easier to evaluate, which is known as data augmentation. A pseudo-marginal algorithm attempts to combine the precision of the marginal sampler with the computational efficiency of data augmentation techniques.
Grouped Independence Metropolis-Hastings (GIMH) is a pseudo-marginal algorithm which uses importance sampling to estimate π(θ). When running any form of MCMC sampler, the performance of the resulting chain is of great importance. We show that as the number of importance sampling particles approaches infinity the performance of the chain produced by the GIMH algorithm converges to that of the marginal algorithm.
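To make the pseudo-marginal idea concrete, here is a minimal Python sketch in the spirit of GIMH on a toy model. All modelling choices are assumptions made for illustration and are not taken from the talk: a single observation y, a latent z ~ N(θ, 1) with y | z ~ N(z, 1), a N(0, 10²) prior on θ, and a random-walk proposal. The intractable marginal likelihood is replaced by an importance-sampling average, and the estimate attached to the current state is reused until a proposal is accepted.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def likelihood_estimate(theta, y, n_particles, rng):
    """Unbiased importance-sampling estimate of p(y | theta).

    Toy model (an assumption): z ~ N(theta, 1), y | z ~ N(z, 1).
    """
    total = 0.0
    for _ in range(n_particles):
        z = rng.gauss(theta, 1.0)
        total += normal_pdf(y, z, 1.0)
    return total / n_particles


def pseudo_marginal_mh(y, n_iter, n_particles, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings using estimated target values."""
    rng = random.Random(seed)
    theta = 0.0
    # Estimated unnormalised posterior at the current state; it is stored
    # and reused, not recomputed, which is what keeps the chain exact.
    target = normal_pdf(theta, 0.0, 10.0) * likelihood_estimate(theta, y, n_particles, rng)
    chain = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)
        prop_target = normal_pdf(prop, 0.0, 10.0) * likelihood_estimate(prop, y, n_particles, rng)
        # Accept with probability min(1, prop_target / target).
        if rng.random() * target < prop_target:
            theta, target = prop, prop_target
        chain.append(theta)
    return chain
```

Increasing `n_particles` reduces the noise in the likelihood estimate, so the chain behaves more like the marginal sampler, at a higher cost per iteration.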
Start time 09:35
COMPUTATIONAL ADVANCES IN FITTING MIXTURE MODELS VIA THE EM ALGORITHM
Adrian O’Hagan
University College Dublin, Ireland
Keywords: Expectation-Maximisation Algorithm, Starting values, Multimodal likelihood functions, Convergence rate, Multicycle ECM Algorithm
The Expectation-Maximisation (EM) Algorithm is a popular tool for deriving maximum likelihood estimates in a large family of statistical models. Chief among its attributes is the property that the algorithm always drives the likelihood uphill. However it can be difficult to assess convergence and, in the case of multimodal likelihood functions, the algorithm may become trapped at a local maximum.
We introduce a variety of schemes to promote algorithmic efficiency. A range of “burn-in” functions are described. These can produce initialising values for the EM algorithm of a higher quality than those arising from simply employing random starts. The use of likelihood monitoring and multicycle features allows maximization steps to be ordered and targeted on parameter subsets. Outcomes are compared with those from the model-based clustering package mclust in R where a hierarchical clustering initialisation is performed. The overall goal is to increase convergence rates to the global likelihood maximum and/or to attain the global maximum in a higher percentage of cases.
Start time 10:00
SUMMARY STATISTICS FOR APPROXIMATE BAYESIAN COMPUTATION
Dennis Prangle
Lancaster University
Keywords: ABC, MCMC, Bayesian statistics
Approximate Bayesian Computation (ABC) methods are a family of algorithms for ‘likelihood-free’ Bayesian inference. The domain of use is models where numerical evaluation of the likelihood is impossible or impractical, but from which data can easily be simulated. For example, over the last decade ABC has allowed investigation of realistic but previously intractable models in population genetics. Other applications include infectious disease epidemiology and missing data models.
ABC operates by simulating data $X_{\mathrm{sim}}$ from the model of interest for many parameter values θ and constructing an approximation to the posterior from those θ values for which the associated $X_{\mathrm{sim}}$ closely matches the observations $X_{\mathrm{obs}}$. Algorithms have been proposed which implement this idea within the frameworks of rejection and importance sampling, Markov Chain Monte Carlo and Sequential Monte Carlo.
A key insight in past research is that to achieve practical acceptance rates, ‘closeness of match’ should be judged by some norm $\|S(X_{\mathrm{sim}}) - S(X_{\mathrm{obs}})\|$, where $S(\cdot)$ are low dimensional summary statistics of a data set. However the problem of how to choose S well is an open question in the literature.
This talk uses visual examples to introduce the main ideas of ABC and describe a novel methodology for constructing efficient summary statistics. Theoretical support for the method is also briefly outlined.
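The rejection form of ABC described above can be sketched in a few lines of Python. The setting is a toy one chosen for illustration and is an assumption, not the talk's example: data are N(θ, 1), the prior on θ is uniform on (-5, 5), and the summary statistic S is the sample mean. The function name and tolerance `eps` are hypothetical.

```python
import random
import statistics


def abc_rejection(x_obs, n_draws=20000, eps=0.1, seed=0):
    """Rejection ABC for the mean of N(theta, 1) data (toy assumption).

    A prior draw theta is kept when the summary of data simulated at
    theta is within eps of the observed summary.
    """
    rng = random.Random(seed)
    s_obs = statistics.fmean(x_obs)  # S(X_obs): the sample mean
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw from the prior
        x_sim = [rng.gauss(theta, 1.0) for _ in x_obs]  # simulate X_sim
        if abs(statistics.fmean(x_sim) - s_obs) < eps:
            accepted.append(theta)
    return accepted
```

Shrinking `eps` tightens the approximation to the posterior but lowers the acceptance rate, which is the trade-off that makes the choice of summary statistics matter.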
Start time 10:25
INVESTIGATING METHODS TO APPROXIMATE THE EXPECTATION EFFICIENTLY
Clare Raychaudhuri
University of Bristol, UK
Keywords: variance reduction, Monte-Carlo methods
Suppose we wish to estimate the expectation of a function $g(x)$ with respect to the standard Gaussian distribution, i.e. the Gaussian distribution with mean 0 and variance the identity matrix. One method to estimate this expectation is to use basic Monte-Carlo methods, $\frac{1}{n}\sum_{i=1}^{n} g(x_i)$. However, basic Monte-Carlo methods may require a large number of function evaluations for the estimate to converge. Luckily it is often possible to speed up this convergence using control variates. In order to use a control variate it is required that there exists a function $\alpha(x)$ for which the expectation is known, $E\{\alpha(x)\} = c$, and which has a strong correlation with $g(x)$. The new estimator $\mu$ for $E\{g(x)\}$ is

$$\mu = \frac{1}{n}\sum_{i=1}^{n}\left\{ g(x_i) + \left[c - \alpha(x_i)\right] B \right\}.$$

The variance of this estimator is minimised when $B = B^*$, where

$$B^* = \mathrm{Var}\{\alpha(X)\}^{-1}\,\mathrm{Cov}\{\alpha(X), g(X)\}.$$

However, $B^*$ is often not known, so it has to be estimated using linear regression. In the case where $\alpha(X) = X$ and so $c = 0$, this problem is equivalent to estimating the intercept of a linear regression of (1, X) on Y. Unfortunately this is a biased estimator of $E\{g(x)\}$ since the same data points are used both to estimate $B$ and to estimate $\mu$. Therefore a method such as the jack-knife should be applied to reduce the estimator bias and provide an estimate of the variance of the estimator.
While using linear regression is appropriate when $n \gg q + p$ (where p is the dimension of y and q is the dimension of x), it is not appropriate if there is only a small sample size n. In this case dimension reduction techniques such as principal component analysis or partial least squares analysis can be considered.
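A one-dimensional Python sketch of the control variate estimator may help fix ideas. It takes α(x) = x with c = 0 and estimates B from the same sample by the ratio of sample covariance to sample variance, which is the regression slope. The function name and the test function g(x) = exp(x) are assumptions for illustration, and the jack-knife bias correction mentioned in the abstract is deliberately omitted.

```python
import math
import random
import statistics


def control_variate_estimate(g, n=10000, seed=0):
    """Estimate E{g(X)} for X ~ N(0, 1) with alpha(x) = x as control variate.

    Returns the plain Monte Carlo estimate, the control variate estimate
    and the estimated coefficient B. B is estimated from the same draws,
    so the control variate estimate carries a small bias.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    gs = [g(x) for x in xs]
    plain = statistics.fmean(gs)
    x_bar = statistics.fmean(xs)
    # Sample version of B* = Var{alpha(X)}^{-1} Cov{alpha(X), g(X)}.
    cov = sum((x - x_bar) * (gv - plain) for x, gv in zip(xs, gs)) / (n - 1)
    b_hat = cov / statistics.variance(xs)
    c = 0.0  # known expectation of alpha(X) = X
    controlled = statistics.fmean(gv + (c - x) * b_hat for x, gv in zip(xs, gs))
    return plain, controlled, b_hat


if __name__ == "__main__":
    plain, controlled, b_hat = control_variate_estimate(math.exp)
    # The exact value of E{exp(X)} is exp(0.5).
    print(plain, controlled, b_hat, math.exp(0.5))
```

For g(x) = exp(x) the control variate removes the part of the variance explained by the linear term, so the second estimate should typically sit closer to exp(0.5) than the first.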
Start time 10:50
SAMPLING FROM THE POSTERIOR- MCMC, IMPORTANCE RESAMPLING OR NUMERICAL INTEGRATION?
Dina Vrousai and John Haslett
Trinity College Dublin, Ireland
Keywords: Numerical Integration, MCMC
Many methods and algorithms have been developed to sample from the posterior distribution. Importance resampling (IR) and particularly Markov Chain Monte Carlo (MCMC) methods are widely used for this purpose. Sampling from the posterior using these methods does not require knowledge of the normalizing constant. An alternative is to compute the normalizing constant and then to sample from the posterior. This can be very computationally demanding, especially in high dimensional problems.
We are using a recently released R package which implements multidimensional integration algorithms, only for Riemann integrals (unit hypercube). The aim is to compare the special characteristics of these three methods (IR, MCMC, Numerical integration) using an application on blood lactate data. We are using Kriging with Gaussian processes to model these data. We then compare the posterior distributions for our model obtained using these three different methods.
12.1.3 Session 1c: Operational Research
Session Room: MS.05
Chair: Fiona Sammut
Start time 09:10
BAYESIAN FORECASTING MODELS FOR TRAFFIC MANAGEMENT SYSTEMS
Osvaldo Anacleto-Junior and Dr. Catriona Queen
The Open University, Department of Mathematics and Statistics
Many roads have real-time traffic flow data available which can be used as part of a traffic management system. In a traffic management system, traffic flows are monitored over time with the aim of reducing congestion by taking actions, such as imposing variable speed limits or diverting traffic onto alternative routes, when problems arise. Reliable short-term forecasting models of traffic flows are crucial for monitoring traffic flows and, as such, are crucial to the ultimate success of any traffic management system.
The model used here for forecasting traffic flows uses a directed acyclic graph (DAG) in which the nodes represent the time series of traffic flows at the various data collection sites, and the links between nodes represent the conditional independence and causal structure between flows at different sites. The DAG breaks the multivariate model into simpler univariate components, each one being a dynamic linear model. This makes the model computationally simple, no matter how complex the traffic network is, and allows the forecasting model to work in real-time, as required by any traffic management system.
This talk will report current research in the development of this class of model with particular reference to a busy motorway junction in the UK.
Start time 09:35
MODELLING AND INFERENCE FOR NETWORKS WITH REPAIRABLE REDUNDANT SUBSYSTEMS
Louis JM Aslett and Simon P Wilson
Trinity College Dublin, Ireland
Keywords: Bayesian inference, reliability theory, phase-type distributions,telecommunications, MCMC
We consider the problem of modelling the reliability of a network of subsystems where each subsystem has redundancy and is repairable. The motivation for this work is large-scale telecommunications networks.
The time to failure of the subsystem hardware is modelled by an appropriate Markov process and hence follows a phase-type distribution. The network structure defines a failure rule in terms of the states of the subsystems, allowing computation by Monte Carlo simulation of the time-to-failure distribution for the network. When data on the reliability of the subsystems are available, they can be incorporated via modifications to an existing Bayesian inference approach to update the prediction of network reliability.
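The Monte Carlo step described above can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' model: each subsystem is assumed to have two identical units with exponential failure and repair times, and the network is assumed to fail as soon as any one subsystem fails (a series failure rule). All names and rates are hypothetical.

```python
import random


def simulate_subsystem_failure_time(fail_rate, repair_rate, rng):
    """Time until both units of a two-unit repairable subsystem are down.

    The state counts working units: 2 -> 1 -> 0 (absorbing), with repair 1 -> 2.
    The absorption time of this Markov process has a phase-type distribution.
    """
    working, elapsed = 2, 0.0
    while working > 0:
        if working == 2:
            # Either of the two working units may fail.
            elapsed += rng.expovariate(2 * fail_rate)
            working = 1
        else:
            total_rate = fail_rate + repair_rate
            elapsed += rng.expovariate(total_rate)
            # The remaining unit fails before the repair completes
            # with probability fail_rate / total_rate.
            working = 0 if rng.random() < fail_rate / total_rate else 2
    return elapsed


def simulate_network_failure_time(n_subsystems, fail_rate, repair_rate, rng):
    """Assumed failure rule: the network fails when its first subsystem fails."""
    return min(
        simulate_subsystem_failure_time(fail_rate, repair_rate, rng)
        for _ in range(n_subsystems)
    )


rng = random.Random(42)
n_sims = 20000
subsystem_mean = sum(
    simulate_subsystem_failure_time(1.0, 2.0, rng) for _ in range(n_sims)
) / n_sims
network_mean = sum(
    simulate_network_failure_time(3, 1.0, 2.0, rng) for _ in range(n_sims)
) / n_sims
```

For this two-unit chain the mean time to failure has the closed form (3λ + μ) / (2λ²), which equals 2.5 for λ = 1 and μ = 2, so the simulated subsystem mean can be checked directly.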
Start time 10:00
MULTI-ARMED BANDIT WITH REGRESSOR PROBLEMS
Benedict May and Dr. David Leslie
University of Bristol, UK
Keywords: Bandit Problem, Reinforcement Learning, Linear Regression, Nonparametric Regression
The multi-armed bandit problem is a simple example of the exploitation/exploration trade-off generally inherent in reinforcement learning problems. An agent is tasked with learning from experience how to sequentially make decisions in order to maximise average reward. In the extension considered, the agent is presented with a regressor before making each decision. The agent has to balance the tendency to explore apparently sub-optimal actions (in order to improve regression function estimates) against the tendency to exploit the current estimates (in order to maximise reward). Study of several past approaches to similar problems has indicated particular desirable properties for the policy used. These properties motivate the choice and study of the algorithm that features in this work. The theoretical properties of the algorithm have been studied and it has been tested on both linear and nonparametric regression problems. The intuitive algorithm has useful convergence properties and, compared to many conventional methods, performs well in simulations.
Start time 10:25
ANALYSING STRATEGY IN THE SPRINT RACE IN TRACK CYCLING USING LOGISTIC REGRESSION
Joanne Moffatt¹, Philip Scarf¹, Louis Passfield² and Ian McHale¹
¹ Centre for Operations Management, Management Science and Statistics, Salford Business School, University of Salford, UK
² Centre for Sports Studies, University of Kent, UK
Keywords: Individual sprint race, Track cycling, Strategies, Logistic regression
Competitors and coaches in sports continually try to gain a competitive edge by optimising strategy. One highly tactical contest is the individual sprint in track cycling, where one small strategic error can potentially cost the competitor the race. The aim of this research is to use statistical analysis to give insight into strategies in this event. Eight logistic regression models were developed to predict the probability of the leading rider winning from different stages of the race, based on how the race proceeded just before each stage. Logistic regression was selected since it is suitable to use when there are a large number of potential strategies. It also has the advantage of being simple to implement and straightforward to interpret. Key strategies were successfully identified from the models, including how the leading rider can defend their lead and how the following rider optimises their chances of overtaking.
Start time 10:50
INTERPRETATION PROBLEMS IN MULTIVARIATE CONTROL CHART
Siti R.M. Hashim
University of Sheffield, UK
Keywords: multivariate control chart, multivariate processes, quality control, diagnostic method, correlation
Multivariate control charts have assumed a major role in multivariate process quality control. Unlike univariate control charts, interpreting the out-of-control signals triggered by a multivariate control chart is not a straightforward task. Practitioners and quality control researchers have proposed a few diagnostic methods to deal with this problem. Unfortunately, most of the proposed methods do not perform similarly under different types of mean shift and correlation. As a result, different diagnostic methods might lead to different interpretations and conclusions. In this study, a few diagnostic methods are selected and tested under different types of mean shift and correlation. The performance of the selected diagnostic methods is measured by the percentage of correct identifications with respect to the different mean shifts and correlations. A general guideline will be given for selecting appropriate diagnostic methods when interpreting the signals produced by multivariate control charts.
12.1.4 Session 1d: Statistical Inference
Session Room: A1.01
Chair: Stephen Burgess
Start time 09:10
DEVELOPING EFFECT SIZES FOR NON-NORMAL DATA
Amin Jamalzadeh
Durham University, UK
Keywords: Effect size, hypothesis test, two sample t-test, Normal distribution, Weibull distribution
The classical hypothesis testing model seeks to determine whether to reject the hypothesis of the non-existence of a phenomenon. Therefore, statistical significance does not necessarily provide information about the importance or magnitude of the phenomenon. There are indicators, known as effect sizes (ES), which are used by some to quantify the degree to which a phenomenon exists. Statistical significance is not a direct measure of ES, but there exists a functional relationship between the sample size, the ES and the p-value. For this reason, if the sample size is sufficiently large even a weak ES may appear as statistically significant. The ES has mainly been introduced and investigated under the assumption of a normal distribution for the underlying population. However, there are many circumstances where the populations are non-normal, or depend on scale and shape and not just a location parameter. We will review how to interpret the effect size for comparison studies of two independent samples when the assumption of normality holds. We will also investigate how results change when the location and scale parameters both change for a normal population. We introduce explorations of effect sizes for phenomena in which the variable follows a distribution with shape and scale parameters. As a special case, power analysis and sample size determination will be discussed for continuous and discrete Weibull distributions in two-sample comparisons. Finally, as an application, we show how to detect the effect of some factors on the amount of time spent and the number of pages viewed while a user browses an e-commerce website.
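The functional relationship between sample size, effect size and significance can be made concrete for the normal two-sample case. The sketch below assumes Cohen's d as the effect size and the pooled-variance t statistic, for which t = d · sqrt(n1 n2 / (n1 + n2)); the abstract does not say which ES measure the talk uses, so this choice and the function names are illustrative assumptions.

```python
import math
from statistics import mean, variance


def cohens_d(x, y):
    """Standardised mean difference using the pooled sample variance."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)


def t_from_d(d, n1, n2):
    """Pooled two-sample t statistic implied by effect size d and group sizes."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))


# The same weak effect (d = 0.1) at two sample sizes per group.
small_n_t = t_from_d(0.1, 20, 20)      # about 0.32, far from significant
large_n_t = t_from_d(0.1, 2000, 2000)  # about 3.16, beyond the usual 1.96 cut-off
```

The effect size is identical in both cases; only the sample size moves the test statistic across the significance threshold.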
Start time 09:35
INFERENCE WITHOUT LIKELIHOOD
Joao Jesus
University College London, UK
Keywords: Estimating Functions, Method of Moments, Efficiency, Minimal Variance, Simulation, Rainfall
Maximum likelihood estimation has been shown to be optimal for numerous classes of statistical models. However, there are still many cases for which it is not possible to derive a likelihood, and where moment-based inference is traditionally used. The aim of this talk is to show some asymptotic results for moment-based estimators, including consistency and efficiency. We investigate the validity of the asymptotic results for finite samples using simulations. The particular processes chosen are from a class of models for rainfall based on point processes, which are widely present in the rainfall modelling literature and are also used by official bodies such as the UK Climate Impacts Programme.
Start time 10:00
MAXIMUM LIKELIHOOD ESTIMATION OF DISCRETE DISTRIBUTION PARAMETERS USING R
Fiona McElduff, Mario Cortina-Borja and Angie Wade
Centre for Paediatric Epidemiology and Biostatistics, Institute of Child Health, UCL.
Keywords: discrete distributions, maximum likelihood estimation, rapid estimation
Value inflation, truncation and overdispersion frequently appear in discrete datasets. The most widely used model for discrete data is the Poisson distribution, but in practice the equal mean-variance assumption is often not supported by the observations. Many probability distribution functions have been developed to improve the modelling of highly skewed variables. It is of interest to fit several models corresponding to competing hypotheses about the data-generating mechanism. We have developed an R library to fit a comprehensive range of probability distributions to discrete data using maximum likelihood estimation. The library includes models characterised as parameter-mix Poisson distributions and members of the Lerch and generalized hypergeometric families, as well as their modified versions, e.g. those incorporating value inflation and truncation. Models are compared using the BIC. We apply this methodology to several datasets within the field of child health research.
Start time 10:25
INVESTIGATING THE IMPACT OF MISSING DATA ON CRONBACH’S ALPHA ESTIMATES AND CONFIDENCE INTERVALS
Emmanuel Ogundimu
University of Warwick, UK
Cronbach’s alpha is widely used to describe the reliability of tests and measurements. Point estimates of Cronbach’s alpha are readily computed by statistical software, and methods for constructing confidence intervals have also been suggested in the literature. However, both point estimates and confidence intervals of Cronbach’s alpha can give misleading results when data are missing. We demonstrate in a Monte Carlo study the impact of missing data on point estimates and confidence intervals for Cronbach’s alpha when items in tests have homogeneous or heterogeneous covariance, and when an underlying normality assumption holds or is violated for test items. In particular, we assess the coverage rates of the Exact, Normal theory (NT) and Asymptotic Distribution Free (ADF) intervals for Cronbach’s alpha. Four methods of imputing missing item scores were evaluated. Finally, we recommend the ‘best’ imputation techniques for test developers to use when their data fall within the scenarios described in this study.
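For reference, the point estimate discussed above is usually computed as alpha = k/(k-1) · (1 - sum of item variances / variance of the total score), where k is the number of items. The sketch below implements this standard formula for complete data only; the function name and the toy scores are assumptions, and none of the imputation methods evaluated in the talk are reproduced here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for complete data.

    scores: list of respondents, each a list of k item scores with no
    missing values (rows with missing items must be handled beforehand).
    """
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical test with three items answered by four respondents.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
noisy = [[1, 4, 2], [2, 1, 4], [3, 3, 1], [4, 2, 3]]
alpha_consistent = cronbach_alpha(consistent)  # items agree perfectly
alpha_noisy = cronbach_alpha(noisy)            # items barely agree
```

Because alpha depends on item variances and the total-score variance, any rule for dropping or imputing missing items changes both ingredients, which is one way the bias studied in the talk can arise.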
Start time 10:50
POSETS, MOBIUS FUNCTIONS AND TREE-CUMULANTS
Piotr Zwiernik
University of Warwick, UK
Keywords: partially ordered sets, cumulants, model identifiability, Bayesian networks with hidden variables, phylogenetic tree models, binary data
It has been noted by several authors that in the case of multivariate models cumulants often form a convenient system of coordinates. We investigate Bayesian networks on rooted trees where all variables in the system are binary and the inner nodes represent hidden variables. We show that in this case we can construct a more flexible change of coordinates. This change depends on classical results in the theory of partially ordered sets, which mirrors the combinatorial definition of cumulants.
The new coordinates give us a good understanding of the structure of the models under consideration. The nice structure of the parameterization allows us, for example, to understand the identifiability issues for this class of models: the formulae for the estimators in the case when the model is identified, and the structure of the MLE fibers in the case when it is not.
12.1.5 Session 2a: Medical Statistics I
Session Room: MS.01
Chair: Mouna Akacha
Start time 11:30
MODELLING BLOOD GLUCOSE CONCENTRATION FOR PEOPLE WITH TYPE 1 DIABETES
Sean Ewings
University of Southampton, UK
Type 1 diabetes mellitus is a chronic metabolic disorder which affects millions of people worldwide. It is characterised by loss of insulin-production mechanisms which results in prolonged high blood glucose concentration (hyperglycaemia). Day-to-day treatment is the responsibility of the individual and is based on injections of insulin. Insulin requirements are assessed daily according to various lifestyle factors, predominantly diet and exercise. Poor control of the illness is associated with many short- and long-term health complications such as ketoacidosis, cardiovascular events (heart disease, stroke) and neuropathy. Diabetes UK (DUK) currently supports a three-year study to investigate and model the effect of physical activity on capillary blood glucose concentration. Volunteers in the study have blood glucose and exercise (as metabolic equivalent of task, MET) recorded continuously over a number of days. Food and insulin regimes are also recorded. Previous research provides models for the action of ingested carbohydrate and injected insulin in the blood. These models may be combined with the information on METs in order to investigate the behaviour of blood glucose concentration. The focus is on a descriptive model that can aid current treatment and hence limit complications. Currently, various time series models, including dynamic linear models, are being investigated.
Start time 11:55
METHODS FOR THE ANALYSIS OF ASYMMETRY
Joanna Smith
University of Glasgow, UK
Keywords: shape analysis, asymmetry, landmarks
There is interest in knowing the extent of asymmetry present in the breasts of patients who have undergone a unilateral mastectomy and reconstruction procedure. Three-dimensional images were captured for 44 such patients, and each case was then marked with ten anatomically significant landmarks. Asymmetry can be quantified as the degree to which there is a mismatch between a landmark configuration (the set of all landmarks on an individual image) and its relabelled and matched reflection. After a configuration has been reflected, rotated and scaled to minimise the sum of squared distances between corresponding landmarks, we should have removed any location, orientation and size effects and be left purely with the genuine shape differences. This can be quantified as an asymmetry score for each patient. These asymmetry scores give an indication of the overall asymmetry present in a case; however, it is also possible to examine which factors contribute to this asymmetry. We can assess separately how much of the asymmetry that is present is due to the location, orientation and size of the reconstructed breast. It follows that any asymmetry remaining after these transformations is due to a difference in the actual shape of the breasts, or an ‘intrinsic asymmetry’. It is also desirable to examine asymmetry over the whole surface of the breasts, rather than just the landmarks. In order to do this, we create a set of comparable points across all breasts, so that they all have the same number of points in corresponding positions. Then, after reflection, the asymmetry can be quantified by calculating the distances between these corresponding points on the reconstructed and unreconstructed breast. The shape differences between the two breasts can also be examined by a principal components analysis.
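The reflect, relabel and match idea can be sketched in a simplified setting. The example below assumes two-dimensional landmarks stored as complex numbers, for which the full Procrustes distance between two centred, unit-size configurations z and w has the closed form sqrt(1 - |⟨z, w⟩|²). The talk works with three-dimensional images, so this is a deliberately reduced illustration, and the function names, landmark coordinates and relabelling are hypothetical.

```python
import math


def preshape(points):
    """Centre complex landmarks at the origin and scale to unit centroid size."""
    centroid = sum(points) / len(points)
    centred = [p - centroid for p in points]
    size = math.sqrt(sum(abs(p) ** 2 for p in centred))
    return [p / size for p in centred]


def asymmetry_score(points, relabel):
    """Full Procrustes distance between a configuration and its relabelled reflection.

    relabel[i] is the index of the landmark that is the mirror partner of
    landmark i (midline landmarks map to themselves).
    """
    z = preshape(points)
    # Complex conjugation reflects in the real axis; relabelling swaps
    # left and right landmarks so that corresponding points are compared.
    reflected = [points[relabel[i]].conjugate() for i in range(len(points))]
    w = preshape(reflected)
    inner = sum(a.conjugate() * b for a, b in zip(z, w))
    # Optimal rotation and scaling are absorbed by taking |inner|.
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))


relabel = [1, 0, 2]  # landmarks 0 and 1 are a left/right pair, 2 is on the midline
symmetric_score = asymmetry_score([-1 + 0j, 1 + 0j, 2j], relabel)
asymmetric_score = asymmetry_score([-1 + 0j, 2 + 0.5j, 2j], relabel)
```

A perfectly symmetric configuration scores zero (up to rounding), while moving one landmark of the pair away from its mirror position gives a strictly positive score.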
Start time 12:20
MEASUREMENT ERROR CORRECTION OF THE ASSOCIATION BETWEEN FASTING BLOOD GLUCOSE AND CORONARY HEART DISEASE - A STRUCTURAL FRACTIONAL POLYNOMIAL APPROACH
Alexander Strawbridge
MRC Biostatistics Unit, Cambridge
Keywords: measurement error, fractional polynomials, regression calibration, epidemiology
Some epidemiological variables such as height and weight may be assumed to be measured precisely; however, others such as blood pressure, blood glucose or food intake may be subject to substantial measurement error.
Fractional polynomials are widely used in epidemiological studies to model continuous non-linear exposure-response relationships, but measurement error can lead to serious bias in the parameter estimates in our models. Regression calibration is an intuitive and easily implemented method for modelling the relationship between true exposure and observed exposure when repeat measurements are available.
We show how fractional polynomials and regression calibration can be combined to produce a model that is corrected for the bias induced by measurement error. We then illustrate this method on a dataset looking at the association between fasting blood glucose and the risk of coronary heart disease events, and show that measurement error may be leading us to underestimate the risk associated with higher than normal levels of blood glucose.
Start time 12:45
MODELLING THE EFFECTS OF ANTIBIOTICS ON CARRIAGE LEVELS OF MRSA
Eleni Verykouki
University of Nottingham, UK
Keywords: Markov Models, Maximum Likelihood, MCMC
Methicillin-resistant Staphylococcus aureus (MRSA) is a bacterium that is usually found on the skin and in the nose. Once it enters the body it becomes harmful, as it is resistant to antibiotics, and it is one of the most serious causes of nosocomial and surgical site infections. In this project we are interested in assessing the effect of antibiotics on MRSA using data taken from a hospital study in London. A discrete-time Markov chain model is used to describe the daily MRSA carriage level in patients. Frequentist and Bayesian inference for the model parameters is drawn via maximum likelihood and MCMC methods respectively. We validate our methodology using simulated data and then fit our model to the real data (obtained from the above study). Finally, we discuss how chi-square tests can be used to assess the goodness of fit.
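As a minimal sketch of maximum likelihood for a discrete-time Markov chain, the estimate of each transition probability is the observed proportion of moves from state i to state j. The example assumes a time-homogeneous chain with no covariates, so it leaves out the antibiotic effects that are the focus of the talk; the state coding and the sequences are hypothetical.

```python
def estimate_transition_matrix(sequences, n_states):
    """MLE of a time-homogeneous transition matrix from observed state sequences."""
    counts = [[0] * n_states for _ in range(n_states)]
    for seq in sequences:
        # Count each observed one-step transition a -> b.
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows for states that were never left are returned as zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical daily carriage levels for two patients (0 = low, 1 = high).
patients = [[0, 0, 1, 1, 0], [1, 1, 1, 0]]
p_hat = estimate_transition_matrix(patients, n_states=2)
```

Here the counts are 0→0: 1, 0→1: 1, 1→0: 2 and 1→1: 3, giving estimated rows (0.5, 0.5) and (0.4, 0.6).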
Start time 13:10
PLANNING FUTURE STUDIES BASED ON THE CONDITIONAL POWER OF A RANDOM-EFFECTS META-ANALYSIS
Verena Roloff and Julian Higgins
MRC Biostatistics Unit, Cambridge, UK
Keywords: Random-effects meta-analysis, conditional power, sample size, information size, heterogeneity
Systematic reviews like those produced by The Cochrane Collaboration often provide recommendations for further research. When meta-analyses are inconclusive, such recommendations typically argue for further studies to be conducted. However, the nature and amount of future research should depend on the nature and amount of the existing research. We propose a method based on conditional power to make these recommendations more specific. Assuming a random-effects meta-analysis model, we evaluate the influence of the number of additional studies, of their information sizes and of the heterogeneity anticipated among them on the ability of an updated meta-analysis to detect a pre-specified effect size. The conditional powers of possible design alternatives can be summarized in a simple graph which can also be the basis for decision making. An example from the literature is used to demonstrate our strategy. We find that if heterogeneity is anticipated, it might not be possible for a single study to reach the desirable power, no matter how large it is.
12.1.6 Session 2b: Financial
Session Room: MS.04
Chair: Murray Pollock
Start time 11:30
MODELLING THE RANK SYSTEM WITH GIBBS, BOSE EINSTEIN OR ZIPF LAW. APPLICATION IN MATHEMATICAL FINANCE
Tomasz Lapinski
University of Warwick, UK
Rank systems frequently occur in areas such as linguistics, physics, economics and finance, and their structure therefore varies significantly. Existing modelling approaches have been developed and introduced separately to meet the needs of each particular discipline.
However, it turns out that for a particular rank system, which has not been explored before, we are able to combine the existing approaches and then determine which of the distributions is the most appropriate: Gibbs, Bose-Einstein or Zipf law, assuming that in real life such a system obeys the maximum entropy principle.
In particular, this approach could be used in financial mathematics for the choice of an optimal portfolio of assets.
Start time 11:55
A MARTINGALE APPROACH TO ACTIVE PORTFOLIO SELECTION
Daniel Michelbrink
The University of Nottingham, UK
Keywords: active portfolio selection, martingales, expected utility maximisation, geometric Brownian motion
An active portfolio selection problem is considered where an investor is interested in outperforming a benchmark portfolio. This benchmark can be given, for example, by a stock index.
The investor chooses to maximise expected utility from the ratio of his portfolio to the benchmark. The problem can then be solved using a stochastic control approach or a martingale approach. We will present the latter.
Start time 12:20
MEASURING VEGA RISKS OF BERMUDAN SWAPTIONS UNDER THE MARKOV-FUNCTIONAL MODEL
Duy Pham and Dr. Joanne E Kennedy
Department of Statistics, University of Warwick, UK
Keywords: Markov-Functional, Bermudan swaption, Hedging, vega risks
Markov-Functional (MF) models form a popular class of models in which the value of pure discount bonds can be expressed as a functional of some (low-dimensional) Markov process. We shall consider a particular application of MF models: pricing and hedging Bermudan swaptions, which are by far the most common in the interest rate derivatives market. In practice, calculating the risk sensitivities of a Bermudan swaption is as important as calculating its value. In this work, we consider different parametrizations of the driving Markov process and their implications for the Bermudan swaption’s vega risks.
Start time 12:45
MATHEMATICAL AND STATISTICAL MODELS FOR PREDICTING FINANCIAL BEHAVIOUR
Golnaz Shahtahmassebi
University of Plymouth, UK
Keywords: Ultra high frequency financial data, Poisson difference distribution, decomposition, Bayes, Markov chain Monte Carlo
In this study we introduce the application of the Poisson difference (PD) distribution to ultra high frequency financial data. To investigate the behaviour of index change, PD models were implemented in a Bayesian framework via Markov chain Monte Carlo (MCMC) methods. In order to capture the excess of zero counts in the data, a zero-inflated distribution is used. In addition, a decomposition (ADS) model, which decomposes an index change into three components (index activity, direction and size of the index change), was also considered using the Bayesian approach. Both models predicted the index change with a reasonable degree of accuracy. However, the PD model might be easier and less time consuming to implement in online applications, e.g. making predictions. The Gelman convergence diagnostics showed good convergence of the chains for both the ADS and PD models.
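The Poisson difference distribution is the law of X - Y for independent Poisson variables X and Y (also known as the Skellam distribution). The sketch below evaluates its probability mass function by truncated convolution and adds a simple zero-inflated variant. The parameter names, the truncation length and the particular zero-inflation form are assumptions for illustration; the abstract does not specify the parameterisation used in the study.

```python
import math


def poisson_pmf(n, rate):
    """Poisson probability of count n, computed on the log scale for stability."""
    return math.exp(-rate + n * math.log(rate) - math.lgamma(n + 1))


def poisson_difference_pmf(k, rate_up, rate_down, terms=100):
    """P(X - Y = k) for independent X ~ Poisson(rate_up), Y ~ Poisson(rate_down)."""
    start = max(0, -k)  # smallest value of Y for which X = Y + k is non-negative
    return sum(
        poisson_pmf(j + k, rate_up) * poisson_pmf(j, rate_down)
        for j in range(start, start + terms)
    )


def zero_inflated_pmf(k, rate_up, rate_down, extra_zero_prob):
    """Mixture of a point mass at zero and a Poisson difference distribution."""
    point_mass = extra_zero_prob if k == 0 else 0.0
    return point_mass + (1 - extra_zero_prob) * poisson_difference_pmf(k, rate_up, rate_down)


support = range(-30, 31)
total_prob = sum(poisson_difference_pmf(k, 2.0, 1.5) for k in support)
mean_change = sum(k * poisson_difference_pmf(k, 2.0, 1.5) for k in support)
inflated_total = sum(zero_inflated_pmf(k, 2.0, 1.5, 0.3) for k in support)
```

The mean of the distribution is the difference of the two rates (0.5 here), which gives a quick check on the truncated sums.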
Start time 13:10
AN OPTIMAL STOPPING PROBLEM OF FINITE HORIZON WITH REGIME SWITCHING
Chun Wang
School of Mathematical Sciences, The University of Nottingham, UK
Keywords: optimal stopping, regime switching, supermartingale
We study a class of finite-horizon optimal stopping problems under regime-switching models by considering a series of optimal stopping problems and its limit. Applications of this problem include the pricing of American put options where the stock price evolves as a regime-switching geometric Brownian motion. The construction involved leads naturally to a computational procedure, for which a numerical example is also provided.
12.1.7 Session 2c: Elicitation and Epidemiology
Session Room: MS.05
Chair: Michelle Stanton
Start time 11:30
ON ELICITING EXPERT OPINION IN GENERALIZED LINEAR MODELS
Fadlalla G. Elfadaly and Prof. Paul H. Garthwaite
The Open University, UK
Keywords: Elicitation Methods, Prior Distributions, Generalized Linear Models, Interactive Graphical Software
An important assessment task in Bayesian analysis of generalized linear models (GLMs) is to specify an informative prior distribution for the model parameters. Suitable elicitation methods play a key role in this specification by obtaining and including expert knowledge as a prior distribution.
An elicitation method for quantifying opinion about any GLM was developed in Garthwaite and Al-Awadhi (2006). The relationship between each continuous predictor and the dependent variable (assuming all other variables are held fixed) was modelled as a piecewise-linear relation. The regression coefficients of this relation were assumed to have a multivariate normal distribution. However, a simplifying assumption was made regarding independence between these coefficients, in the sense that regression coefficients were a priori independent if associated with different predictors.
In the current research we relax the independence assumption between coefficients of different variables. This will significantly increase the range of situations where the method is useful, but it means that the variance-covariance matrix of the prior distribution is not necessarily block-diagonal. A method of elicitation for this more complex case is given and it is shown that the resulting variance-covariance matrix is positive-definite.
The method was designed to be used with the aid of interactive graphical software, which is being revised and extended further in this research to handle the case of GLMs with correlated pairs of covariates.
Start time 11:55
DISCORDANCY BETWEEN THE PRIOR AND DATA USING CONJUGATE PRIORS
Mitra Noosha
Queen Mary University of London
In Bayesian inference the choice of prior is very important to indicate our beliefs and knowledge. However, if these initial beliefs are not well elicited, then the data may not conform to our expectations. The degree of discordancy between the observed data and the proper prior is of interest. Pettit and Young (1996) suggested a Bayes factor to find the degree of discordancy. I have extended their work to further examples.
I try to find explanations for the behaviour of the Bayes factor. As an alternative, I have looked at a mixture prior consisting of the elicited prior and another with the same mean but a larger variance. The posterior weight on the more diffuse prior can be used as a measure of the discordancy between prior and data, and also gives an automatic robust prior. I discuss various examples and show that this new measure is well correlated with the Bayes factor approach.
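The mixture-prior measure can be sketched for the simplest conjugate case. The example assumes a normal sampling model with known variance and a two-component normal mixture prior with common mean; under these assumptions the marginal density of the sample mean under each component is normal, and the posterior weight on the diffuse component follows from Bayes' theorem. The inflation factor, prior weight and all names are hypothetical choices, not values taken from the talk.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def diffuse_posterior_weight(ybar, n, sigma2, prior_mean, elicited_var,
                             inflation=10.0, prior_weight=0.5):
    """Posterior weight on the diffuse component of a two-part normal mixture prior.

    ybar is the mean of n observations with known sampling variance sigma2.
    The diffuse component shares the elicited mean but has its variance
    multiplied by `inflation`; `prior_weight` is its prior mixing weight.
    """
    se2 = sigma2 / n
    # Marginal (prior predictive) density of ybar under each component.
    m_elicited = normal_pdf(ybar, prior_mean, elicited_var + se2)
    m_diffuse = normal_pdf(ybar, prior_mean, inflation * elicited_var + se2)
    numerator = prior_weight * m_diffuse
    return numerator / (numerator + (1 - prior_weight) * m_elicited)


# Data that agree with the elicited prior versus data far from it.
weight_concordant = diffuse_posterior_weight(0.0, n=10, sigma2=1.0,
                                             prior_mean=0.0, elicited_var=1.0)
weight_discordant = diffuse_posterior_weight(5.0, n=10, sigma2=1.0,
                                             prior_mean=0.0, elicited_var=1.0)
```

When the data sit near the elicited mean the weight stays below the prior value of 0.5, and it moves towards 1 as the data move away, which is the behaviour that makes it usable as a discordancy measure.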
Start time 12:20
INDIAN BUFFET EPIDEMICS
A BAYESIAN APPROACH TO MODELLING HETEROGENEITY
Ashley P. Ford and Gareth O. Roberts
University of Warwick, UK
Keywords: Epidemic, MCMC
The application of mathematical and computer models to the prediction of epidemics in real time often lacks the crucial stage of statistical inference. There is a need for techniques of inference on models which lie between the extremes of oversimplification and being too complex for inference.
The Indian Buffet Epidemic model has been developed to address the need for a model which is more suitable than assuming homogeneous mixing or an incorrect network model. The aim is to have a process which fits the heterogeneity, with two or three parameters that measure the departure from homogeneity.
The Indian Buffet Epidemic combines a bipartite network model with the Indian Buffet process to provide a realistic model which is simple to define and simulate from. The model assumes that there are a large number of potential classes and that individuals belong to a subset of these classes. The classes might be households, schools, clubs, etcetera. An important feature of this new class of models is that the classes do not need to be specified. Within each class infection occurs homogeneously and recovery is as in the basic SIS or SIR model.
The model is described along with an MCMC algorithm for deriving parameter estimates. An important aspect is the development of a new proposal distribution for large binary matrices. The algorithm is demonstrated on a range of simulated data from both the true model and other epidemic models, and comparisons are made between centered and non-centered representations for the augmented data.
Start time 12:45
A HIDDEN MARKOV MODEL TO ANALYSE MRSA TRANSMISSION IN HOSPITAL WARDS
Colin Worby
University of Nottingham, UK
Keywords: hidden Markov model, epidemiological model, MRSA
Methicillin-resistant Staphylococcus aureus (MRSA) remains a problem in healthcare institutions in the UK and worldwide, causing serious, sometimes life-threatening, infections with limited treatment options. For this reason there is much emphasis on the prevention of transmission, for example through the isolation of known cases in side rooms or cohorts, and the use of contact precautions such as disposable gowns and gloves. However, there is still much debate over the efficacy of individual control measures. We use hospital data collected from a selection of general medical wards, and create a model describing MRSA transmission dynamics amongst patients, with the aim of estimating the effectiveness of hospital infection control strategies. A hidden Markov model is used to describe the indirectly observed MRSA transmission process, accounting for the fact that screening to detect MRSA presence is not 100% accurate. This framework allows us to analyse how the probability of a patient acquiring MRSA is related to ward prevalence, and how effective isolation and decolonisation measures are in reducing transmission. The study confirms a reduction in transmission due to the combined effect of isolation and decolonisation treatment. While side room isolation is widely used in controlling the spread of nosocomial pathogens, we found no evidence to suggest that physical isolation, through moving patients to a single room, significantly reduces transmission potential in comparison to isolation methods on the open ward.
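How a hidden Markov model accommodates imperfect screening can be sketched with the forward algorithm for a single patient. The example assumes two hidden states (uncolonised, colonised), a constant daily acquisition probability, no clearance, and a swab with imperfect sensitivity but perfect specificity. These are simplifying assumptions for illustration; the model in the talk relates acquisition to ward prevalence and control measures, which is not reproduced here.

```python
def forward_likelihood(observations, acquire_prob, sensitivity, initial_colonised=0.1):
    """Likelihood of one patient's daily swab results under a two-state HMM.

    Hidden states: 0 = uncolonised, 1 = colonised.
    Observations: 1 = positive swab, 0 = negative swab, None = no swab that day.
    """
    # Assumed dynamics: acquisition with a fixed daily probability, no clearance.
    transition = [[1 - acquire_prob, acquire_prob], [0.0, 1.0]]

    def emission(state, obs):
        if obs is None:
            return 1.0  # a day without a swab carries no information
        # Assumed perfect specificity: uncolonised patients never test positive.
        p_positive = sensitivity if state == 1 else 0.0
        return p_positive if obs == 1 else 1 - p_positive

    alpha = [
        (1 - initial_colonised) * emission(0, observations[0]),
        initial_colonised * emission(1, observations[0]),
    ]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * transition[i][j] for i in range(2)) * emission(j, obs)
            for j in range(2)
        ]
    return sum(alpha)


# A negative swab followed by a positive one: either a new acquisition
# or a colonised patient whose first swab was a false negative.
likelihood = forward_likelihood([0, 1], acquire_prob=0.1, sensitivity=0.8)
```

With these values the likelihood is (0.9 × 0.1 + 0.1 × 0.2) × 0.8 = 0.088, where the second term in the bracket is the contribution of the false-negative pathway.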
Start time 13:10
ESTIMATING THE SIZE OF A BADGER POPULATION USING LIVE CAPTURE AND POST-MORTEM DATA
Neil Walker¹, Dez Delahay¹ and Prof Peter Green²
¹ Fera, Woodchester Park, Stonehouse, Glos.
² Maths Dept, University of Bristol
Keywords: Bayesian, mark-recapture, autocorrelation, population size
Woodchester Park in Gloucestershire has been the site of an intensive mark-recapture study on a local badger population since 1975. We consider methods of population size estimation using these data supplemented by information on post-mortem recoveries. Of particular relevance is the integrated approach advocated by Catchpole et al. (1998), which is applied in a Bayesian context. In addition, we look at idiosyncrasies in the data and possible extensions therein, for example temporal autocorrelation in the capture, survival and recovery parameters. Finally, the performance of different models is considered and we discuss possible reasons for these differences.
12.1.8 Session 2d: Multivariate Statistics
Session Room: A1.01
Chair: Nathan Huntley
Start time 11:30
CAUCHY PRINCIPAL COMPONENTS ANALYSIS
Aisha Fayomi and Prof. Andy Wood
University of Nottingham, UK
Keywords: Principal Components Analysis, robust statistical techniques, Cauchy likelihood
Robust methods are highly relevant in multivariate statistical analysis. Many different robust methods have been developed to cover the needs of numerous other fields. Principal components analysis (PCA) is considered one of the most important techniques in statistics. However, it depends on either a covariance or a correlation matrix, both of which are very sensitive to outliers. We therefore set out to develop a more robust alternative to classical PCA, using the Cauchy likelihood function to construct a robust principal components procedure.
Start time 11:55
APPROXIMATE JOINT STATISTICAL INFERENCE FOR LARGE SPATIAL DATASETS
James Sweeney and John Haslett
Trinity College Dublin, Ireland
Keywords: Multivariate nonparametric regression, Palaeoclimate reconstruction, Inverse problems
We propose an approximate sequential approach for inferring the correlation matrix in large multivariate spatial regression problems. This enables the decomposition of the computationally intensive, multivariate, “joint” problem into a set of independent univariate problems with possible correlation structure inferred sequentially. Omission of correlation structure (where inappropriate) in potential models will lead to increased uncertainty in the degree of confidence at the reconstruction stage of an associated inverse problem.
The results from the proposed sequential approach are compared to those obtained using the (correct) full joint approach through the comparison of bias and predictive properties for simulated and palaeoclimate data. The inference procedures used are Empirical Bayes (EB) based, where the hyperparameters governing a given model are considered as unknown fixed constants.
Start time 12:20
MULTIVARIATE OUTLIERS, THE FORWARD SEARCH AND THE CRONBACH’S RELIABILITY COEFFICIENT
Michael Tsagris
University of Nottingham, UK
Keywords: multivariate outliers, Forward search, Cronbach’s alpha
Multivariate outliers are of great interest because of the nature of the data. While things are straightforward in the univariate case, they can become very difficult with more than one variable. In this work, multivariate outlier detection methods are discussed and the Forward search is implemented. Robust estimates of scatter and location are the key feature for the detection of outliers. Finally, Cronbach’s reliability coefficient is discussed and applied to the Forward search as a monitoring statistic.
Start time 12:45
BAYESIAN ANALYSIS IN MULTIVARIATE DATA
Rofizah Mohammad and Dr. Karen Young
University of Surrey, UK
Keywords: Model choice, Bayes factors, Classification, Discriminant analysis, Influential observations
In this presentation we will be considering a Bayesian approach to model selection in multivariate normal data using the Bayes factor, similar to that used by Spiegelhalter and Smith (1982). We are particularly interested in classifying observations when we know that they come from different populations. We shall compare classical techniques of linear and quadratic discriminant functions with a new Bayesian approach. We are interested in looking at the effect of observations on this classification. One diagnostic to determine the effect of observations on a Bayes factor is k_d, which is used to assess the effect of individual observations on model choice (Pettit and Young, 1990).
Start time 13:10
SOME ASPECTS OF COMPOSITIONAL DATA
Fiona Sammut
University of Warwick, UK
Keywords: Multivariate Constrained Data
A composition X is a D-vector whose components X_1, ..., X_D satisfy a sum constraint, that is, X_1 + ... + X_D = c, where c may be equal to 1, 100, 10^6 or any other constant, depending on the unit of measurement. Due to its nature, compositional data conveys only relative information: the elements are always zero or positive, and one part of the composition may always be written in terms of the remaining parts. The data is thus not free to range as the unconstrained variables encountered in traditional multivariate analyses are. This fact conditions the variance-covariance structure in that at least one covariance is forced to be negative. In general, analysing compositional data with methods based on the variance-covariance or correlation structure, such as factor analysis, discriminant analysis and principal component analysis, would lead to incorrect results. It was thus necessary to find some parametric class of distributions which could cater for the dependence structure between the parts of the compositions but which could also make the transition from the simplex (the space of compositional data) to the whole real line possible. A possible approach to such a situation is based on logratio transformations, which provide a one-to-one mapping from the simplex to the real space, removing the problem of having to work within a constrained sample space. Such a transformation then makes it possible to apply standard multivariate techniques to the transformed compositional data. A major shortcoming common to all logratio transformations, however, is that if some parts of a composition are zero, the corresponding logratios cannot be computed. Different strategies had to be developed in an attempt to deal with this problem.
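To make the logratio idea concrete, here is a small sketch of one member of that family, the additive logratio transformation, which maps a D-part composition to D-1 unconstrained real values by taking logs of ratios to the last part. The abstract does not say which logratio transformation is used, so the choice of the additive form and the helper names `closure`, `alr` and `alr_inverse` are assumptions for illustration.

```python
import math


def closure(parts, total=1.0):
    """Rescale positive parts so that they sum to `total` (the constant c)."""
    s = sum(parts)
    return [total * p / s for p in parts]


def alr(comp):
    """Additive logratio: y_i = log(x_i / x_D) for i = 1, ..., D-1."""
    if any(p <= 0 for p in comp):
        # Zero parts are exactly the shortcoming noted in the abstract.
        raise ValueError("logratios are undefined for zero or negative parts")
    return [math.log(p / comp[-1]) for p in comp[:-1]]


def alr_inverse(y):
    """Map D-1 real values back to a composition on the unit simplex."""
    return closure([math.exp(v) for v in y] + [1.0])


x = closure([20, 30, 50])   # composition with c = 1
y = alr(x)                  # two unconstrained real coordinates
x_back = alr_inverse(y)     # recovers x up to rounding
```

The round trip shows the one-to-one mapping between the simplex and real space; passing a composition containing a zero raises an error, which is where the zero-handling strategies mentioned above would come in.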
12.1.9 Session 3a: Genetics
Session Room: MS.01
Chair: Dennis Prangle
Start time 14:30
INCORPORATING AVAILABLE BIOLOGICAL KNOWLEDGE TO EXPLORE GENOME-WIDE ASSOCIATION DATA
Marina Evangelou
MRC-Biostatistics Unit, University of Cambridge, UK
Keywords: Genome-wide association studies, Pathway-based analysis
The evolution of the science of genetics and the development of genotyping technologies have made genome-wide association studies (GWAS) feasible. GWAS have been successful in identifying SNPs that are significantly associated with various complex diseases, but they do not have the required power to detect small effects of SNPs that are known to be biologically associated with the disease. Our research focuses on the exploration of genome-wide association data using pathway-based analysis. Pathway-based analysis is a joint test of association between a group of SNPs/genes within a known biological pathway and the outcome (which can be either a binary response variable or a continuous one). Pathway-based approaches have the advantage of incorporating the available biological knowledge of SNPs and genes and therefore have a better chance of identifying the true model of association.
Our genome-wide association study aims to identify the relationship between genetic loci and platelet function. Platelets, which play an important role in thrombus formation, are rapidly activated by a range of agonists such as collagen and ADP. This study involves a cohort of 500 healthy individuals, for each of whom four endpoints were measured in order to determine platelet function: fibrinogen and p-selectin responses to ADP and collagen agonists. It is believed that a large number of genes with small effects are associated with platelet function, and we aim to find these by implementing approaches to pathway analysis.
Start time 14:55
INFORMED BAYESIAN CLUSTERING OF GENE EXPRESSION LEVELS
Anna Fowler
Imperial College London, UK
Keywords: Bayesian Hierarchical Clustering, Variable Selection, Gene Expression Levels
Single Nucleotide Polymorphisms (SNPs) occur when there is a variation in the DNA sequence at one of the nucleotide bases. This can cause differences in the proteins produced and therefore alter the actions of the cell. HLA-DQA proteins play an essential role in the immune system by presenting antigens to a specific group of white blood cells (T cells) to enable them to produce the antibodies needed. The data we are analysing are part of the HapMap project and consist of genotype labels for three SNPs which cause the 116 subjects to produce different amounts of the HLA-DQA protein. There are also gene expression levels for each subject, which indicate the level of production for the proteins associated with each gene. It is the immune system which is primarily of interest here, and of the 3538 measured genes very few produce proteins which are related to immunity. Identifying the significant genes is complicated by the dimensionality of the data and has been approached in many ways recently.
Two-way Bayesian hierarchical clustering allows clusters to form over both genes and subjects, revealing the underlying block-like structure of the data. Genes which are related to the immune system are more likely to be co-regulated with the SNP genotypes than those which are not. Therefore, the clustering of the subjects and their genotypes will influence the clustering of the genes which are related to the immune system significantly more than the clustering of those which are not. Hence, by applying a novel method of two-way clustering only over the genes which benefit significantly from this additional information, we seek to determine which gene clusters are co-regulated with the production of the HLA-DQA proteins and identify these genes as the variables associated with the immune system.
Start time 15:20
AN APPLICATION OF BAYESIAN TECHNIQUES FOR MENDELIAN RANDOMIZATION TO ASSESS CAUSALITY IN A LARGE META-ANALYSIS
Stephen Burgess and Simon G. Thompson
MRC Biostatistics Unit, University of Cambridge
Keywords: Genetic epidemiology, Mendelian randomization, Causality, Meta-analysis, Bayesian methods
The determination of causality from observational data is historically a controversial question. Observational relationships between a risk factor and an outcome are affected by confounding and reverse causation. Mendelian randomization is a technique whereby genetic information is used analogously to randomization in a randomized controlled trial. Under certain assumptions, genetic information can give insight into the nature and direction of a causal association. Genetic variation in a risk factor is determined at birth, so is causally prior to any event, and is allocated randomly in population groups, meaning that subgroups differing in genetic variants with a specific effect on the risk factor of interest will not systematically differ in other factors. We show how novel Bayesian techniques can be applied to a large dataset, comprising over 100 000 participants in over 30 different studies measuring over 20 different genetic variants, to assess the causal association of C-reactive protein with coronary heart disease.
Start time 15:45
BAYESPEAK: A HIDDEN MARKOV MODEL FOR ANALYSING CHIP-SEQ EXPERIMENTS
Jonathan Cairns1, Christiana Spyrou4, Andy Lynch1, Rory Stark3 and Simon Tavare1
1 Department of Oncology, University of Cambridge, Li Ka Shing Centre, Cambridge, UK
2 DAMTP, Centre for Mathematical Sciences, Wilberforce Road, Cambridge, UK
3 Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Cambridge, UK
4 MRC Clinical Sciences Centre, Faculty of Medicine, Imperial College London, UK
Keywords: Bayesian Inference, Hidden Markov Model, Gibbs Sampling, Metropolis-Hastings, Negative Binomial, Oncology, ChIP-seq
Accurate identification of interactions between proteins and DNA is a key element in understanding the mechanisms that lead to cancer. The biological experiment “ChIP-seq” is used to investigate sites on the chromosome where proteins bind, often activating or silencing a particular gene.
The data presents itself as “peaks” across the chromosome. However, various technical or biological effects can lead to noise, disguising true peaks and even generating false peaks.
Hidden Markov Models (HMMs) have applications in this biological setting. We can use the hidden state to indicate a binding site, and choose a model that reflects the expected biological features of the signal.
“BayesPeak” is an MCMC algorithm we have developed to solve this problem, using a Bayesian approach and based on negative binomial emissions. I will be discussing the statistical issues we face when fitting our theoretical model to large data sets.
12.1.10 Session 3b: Medical Statistics II
Session Room: MS.04
Chair: Helen Thornewell
Start time 14:30
DESIGNING A SERIES OF PHASE II TRIALS
Siew Wan Hee
Warwick Medical School, University of Warwick, UK
In some diseases with very small patient populations, the number of patients eligible for clinical trials is limited. When the development of new therapies increases relatively faster than the recruitment of patients, there is a need to identify a promising treatment as quickly as possible. A design that requires fewer patients will require less time to identify a treatment for further testing in a phase III trial. Some authors (Whitehead, 1985; Yao et al., 1996) have proposed considering a series of clinical trials where each trial tests a treatment that is different from the others. There is a trade-off between large trials, which require many patients, and small trials, which may yield little information, particularly if there is a high start-up cost. We propose a design that is a hybrid of classical frequentist and Bayesian approaches, where the traditional analysis at the end of the trial is based on conventional frequentist hypothesis testing and the Bayesian method is used to maximize the power of the series of trials. Designs are obtained that optimise the number of patients and power for each trial in a series. The total number of patients eligible for trial and the type I error (that is, declaring the treatment effective when it is not) are fixed and a start-up cost is included.
Start time 14:55
RESPONSE-ADAPTIVE BLOCK RANDOMIZATION IN BINARY ENDPOINT CLINICAL TRIALS
Dominic Magirr
Lancaster University, UK
Keywords: Clinical trials, Adaptive design
The results of a clinical trial will typically accumulate steadily throughout its duration. Response-adaptive randomization (RAR) uses the accumulating data in order to skew the randomization of remaining patients to treatment groups in favour of the current better performing treatment. The aim is to reduce the number of patients receiving inferior treatment. RAR has rarely been used in practice. One example is a trial of extra-corporeal membrane oxygenation (ECMO) to treat newborn infants with respiratory failure. The results of the trial were controversial in large part because only one patient received control therapy. In this talk the ECMO trial is described. Alternative RAR designs are proposed that incorporate random permuted blocks in order to eliminate the possibility of such an extremely unequal allocation ratio.
Start time 15:20
BAYESIAN CLINICAL TRIAL DESIGNS FOR SURVIVAL OUTCOMES
Shijie Ren
University of Sheffield, UK
Keywords: Assurance, Survival outcome
When designing a clinical trial, sponsors or decision-makers may only consider the power of the trial, i.e. the conditional probability of a successful trial assuming a specified treatment effect. Since the treatment effect is uncertain, this will not provide a reliable assessment of the probability of a successful outcome and can often give a misleading impression of the likely outcome of the trial. As an alternative to using power, one can consider the unconditional probability of a successful trial outcome, known as assurance. This involves utilizing prior information about treatment effects in the design of the trial. We consider how to derive assurance when a trial’s outcome measure is survival time. We allow for uncertainty in both the treatment effect and the control group survival function.
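The distinction between power and assurance can be illustrated with a deliberately simplified sketch: assurance is the power averaged over a prior distribution for the treatment effect. The example below assumes a two-arm trial with a normally distributed outcome, known standard deviation, a one-sided z test and a normal prior on the effect. None of these choices comes from the abstract, which concerns survival outcomes, and the function names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def power(delta, sigma, n_per_arm, alpha=0.025):
    """Power of a one-sided z test for a fixed, assumed treatment effect."""
    se = sigma * (2 / n_per_arm) ** 0.5        # std. error of the mean difference
    z_crit = NormalDist().inv_cdf(1 - alpha)   # critical value of the test
    return NormalDist().cdf(delta / se - z_crit)


def assurance(prior_mean, prior_sd, sigma, n_per_arm,
              alpha=0.025, draws=20000, seed=0):
    """Monte Carlo estimate of assurance: power averaged over the prior."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        delta = rng.gauss(prior_mean, prior_sd)  # draw a plausible effect
        total += power(delta, sigma, n_per_arm, alpha)
    return total / draws


conditional = power(0.5, 1.0, 100)              # effect assumed known
unconditional = assurance(0.5, 0.5, 1.0, 100)   # effect uncertain
```

In this illustrative setting the power at the assumed effect is high, while the assurance is noticeably lower because the prior places weight on smaller effects, which is the kind of misleading impression the abstract warns about.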
Start time 15:45
THE POWER OF THE BIASED COIN DESIGN FOR CLINICAL TRIALS
Wai Yin Yeung
Queen Mary, University of London, UK
Keywords: biased coin design, clinical trials, sequential patient allocation
The biased coin design introduced by Efron (1971, Biometrika) is a design for allocating patients in clinical trials which helps to maintain the balance and randomness of the experiment. Chen (2006, Journal of Statistical Planning and Inference) studied the power of repeated simple random sampling and the biased coin design, in which the power is treated as the conditional probability of correctly detecting a treatment effect given the current numbers of patients on the two treatments, the control group and the treatment group. The variances of the responses for the two groups are assumed to be equal. The z test and the t test for a treatment effect are used to demonstrate and analyse the power function when the variances of the treatment responses are known and unknown, respectively. Numerical results given in his paper showed that the biased coin design is uniformly more powerful than repeated simple random sampling.
In this talk, I shall report on my current work, which extends Chen’s work on the power to the case where the variances of the responses for the two treatments are assumed to be different. I will give numerical results for the powers of repeated simple random sampling and the biased coin design when the variances are known and different, and also when they are unknown and different.
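For readers unfamiliar with the allocation rule itself, the following is a minimal sketch of a biased coin design: when the two arms are balanced the next patient is allocated by a fair coin, and otherwise the arm with fewer patients is favoured with probability p. The default p = 2/3, the function name and the arm labels are assumptions for illustration; the power calculations discussed in the abstract are not reproduced.

```python
import random


def biased_coin_allocation(n_patients, p=2 / 3, seed=None):
    """Sequentially allocate patients to treatment ("T") or control ("C")."""
    rng = random.Random(seed)
    n_treat = n_ctrl = 0
    allocation = []
    for _ in range(n_patients):
        if n_treat == n_ctrl:
            prob_treat = 0.5        # balanced: toss a fair coin
        elif n_treat < n_ctrl:
            prob_treat = p          # treatment arm is behind: favour it
        else:
            prob_treat = 1 - p      # control arm is behind: favour it
        if rng.random() < prob_treat:
            allocation.append("T")
            n_treat += 1
        else:
            allocation.append("C")
            n_ctrl += 1
    return allocation


arms = biased_coin_allocation(40, seed=1)
imbalance = abs(arms.count("T") - arms.count("C"))
```

Setting p = 1/2 reduces the rule to repeated simple random allocation, the comparator in the abstract, while values of p closer to 1 trade randomness for tighter balance.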
12.1.11 Session 3c: Dimension Reduction
Session Room: MS.05
Chair: James Sweeney
Start time 14:30
ORACLE PROPERTIES OF LASSO-TYPE METHODS IN REGRESSION PROBLEMS
Sohail Chand
School of Mathematical Sciences, University of Nottingham, UK
Keywords: Variable Selection, Lasso, LARS, Oracle properties.
In model building, we often have a large set of predictors. As not all of the variables are equally important for the model, we seek a parsimonious model. Parsimonious models are very important for prediction purposes, as overfitted models have higher prediction variance. In practice, it is often quite difficult to find a model which is a good fit as well as easy to interpret. As discussed by Fan and Li (2001, JASA 96(456):1348-1360), a good estimation procedure should have the oracle properties, namely variable selection consistency and the optimal estimation rate. Lasso-type methods in the regression context are popular for their simultaneous estimation and variable selection. Our numerical results show in some scenarios how normalisation of the predictors can nullify the advantage of using the adaptive weights and may lead to failure of the necessary and sufficient condition for correct subset selection. The choice of the regularisation parameter is critical for the oracle performance of these methods. We have compared the performance of cross validation with the Wang and Leng (2009, J Roy Stat Soc B Met; 71(3):671-683) BIC approach in choosing the appropriate value of the regularisation parameter. Our results show that the cross validation choice of regularisation parameter may lead to inconsistent variable selection.
Start time 14:55
PENALIZED WEIGHTED LEAST SQUARES VARIABLE SELECTION METHOD FOR AFT MODELS WITH HIGH DIMENSIONAL COVARIATES
Md. Hasinur Rahaman Khan and J. Ewart H. Shaw
University of Warwick, UK
Keywords: AFT model, Penalized Regression, Variable Selection, Weighted Least Squares
Although penalized regression methods have received a great deal of attention in recent years for simultaneous variable selection and coefficient estimation, particularly in the analysis of high-dimensional datasets, only a small number of methods based on penalized approaches have been suggested for survival datasets. Here we look at a new penalized approach, based on weighted least squares, for model estimation and variable selection in parametric accelerated failure time (AFT) models. We applied this approach to the Log-Normal AFT model with both low-dimensional and high-dimensional datasets. This approach improves predictive accuracy, which is an important inferential goal in survival analysis when dealing with variable selection techniques. The performance of this approach is demonstrated with simulated examples and real datasets where time to survival, in the presence of right censoring, is of interest.
Start time 15:20
LATENT VARIABLE MODELS FOR PROCESS MONITORING
Javier Serradilla and Dr. Jian Q. Shi
Newcastle University, UK
Keywords: Multivariate Statistical Process Control, Latent Variable Models, Probabilistic PCA
Fault detection and diagnosis in manufacturing processes are key aspects of current good engineering practice. Statistical approaches to fault detection based on historical operating data have been found to be advantageous with processes having a large number of measured variables. These models, however, tend to underperform in the area of fault diagnosis, where the variable(s) responsible for the plant’s abnormal behaviour must be identified.
In this presentation we intend to review how latent variable models can be used both to reduce the data dimensionality and to form subgroups of variables. These new variables are then used for process monitoring. The added advantage of the approach is that each latent variable will be selectively looking at a specific and well defined subset of the original variables. Likewise, fault detection is quicker as the confounding effect of redundant variables is eliminated.
Start time 15:45
A STUDY OF ITEM SELECTION USING PRINCIPAL COMPONENT ANALYSIS AND CORRESPONDENCE ANALYSIS
Nur Fatihah Mat Yusoff
National University of Ireland, Galway
Keywords: item selection, principal component analysis, correspondence analysis
This study investigates dimension-reduction techniques in psychometric testing using Principal Component Analysis (PCA) and Correspondence Analysis (CA). Psychometric research is one of the fields of social science that is interested in the theory and techniques of educational and psychological measurement. Researchers in this area are frequently concerned with the construction and validation of measurement instruments. Theoretically, PCA is a mathematical algorithm that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables by performing a covariance analysis between variables. The PCA concept is closely related to Factor Analysis (FA), which aims to detect structure in the relationships between variables. It is a common technique that has been used by social science researchers in conducting validity and reliability analyses of their studies. CA can be considered as a factor method for categorical variables and is often linked with producing a low-dimensional graphical display of variables and units. Simple CA is a technique designed to analyse a two-way table, while Multiple Correspondence Analysis (MCA) is an extension of simple CA in that it is applicable to a large set of variables. The result will provide information which is similar in nature to that produced by principal component analysis, and allows us to explore the structure of the categorical variables included in the table.
This study is concerned with reducing the dimension, or number of variables, in an instrument by using the data from a pilot study on personality traits. The original instrument was developed by Oliver P. John and Sanjay Srivastava from the University of California, Berkeley in 1999. The pilot survey was conducted at the University Malaysia Sarawak, Malaysia, where 80 students from second year and above were randomly selected as respondents. In the original instrument, there are 44 items to assess five personality traits, or the big five dimensions. We believe that some of the items, or even dimensions, are not relevant in the Malaysian context. At the end of this study, our aim is to produce the best instrument that can represent all of the variables that we are interested in for subsequent use in structural equation modelling of student achievement.
12.1.12 Session 3d: Environmental
Session Room: A1.01
Chair: Andrew Smith
Start time 14:30
USING A BAYESIAN HIERARCHICAL MODEL FOR TREE-RING DATING
Emma M. Jones1, Caitlin E. Buck1, Clifford D. Litton2, Cathy Tyers1 and Alex Bayliss3
1 University of Sheffield, UK
2 University of Nottingham, UK
3 English Heritage, UK
Keywords: Dendrochronology, Bayesian hierarchical modelling
Dendrochronology, or tree-ring dating, uses the annual growth of tree-rings to date timber samples. Variation in ring width is determined by variation in the climate. Trees within the same geographical region are exposed to the same climatic signal in each year, but the signal differs from year to year.
Dendrochronologists measure sequences of tree-ring widths with a view to dating samples by matching undated sequences to dated sequences known as ‘master’ chronologies. The tree-ring widths from undated timbers are measured and the data are processed to remove growth trend. The processed data are sequentially matched against one another, with each match position known as an offset: timbers from the same site or woodland are matched first, and then average sequences from each site or woodland, known as ‘site’ chronologies, are matched to master chronologies.
The hierarchical nature of the data leads to modelling the data using a Bayesian hierarchical model. The ring-width for tree j in year i is modelled as the sum of the climatic signal in year i and a random noise which is particular to tree j in year i. This model can be extended to include climatic signals at varying geographic scales. A Gibbs sampler is used to produce posterior probabilities for a match at each offset. This methodology relies on careful prior specification of parameters at each level of the hierarchy. Data are currently being collated from trees of known age from several woods in the UK that will be used to provide informative prior knowledge.
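The verbal description of the ring-width model can be written compactly as follows. The symbols are notation introduced here for illustration, not taken from the abstract: y_{ij} denotes the processed ring width for tree j in year i, \mu_i the climatic signal common to all trees in year i, and \varepsilon_{ij} the tree- and year-specific noise. No distributional assumptions are shown because none are stated in the abstract.

```latex
y_{ij} = \mu_i + \varepsilon_{ij},
\qquad i = 1, \dots, n \ \text{(years)},
\quad j = 1, \dots, m \ \text{(trees)}.
```

Because \mu_i is shared by every tree in a region, two correctly aligned sequences share the same year-to-year signal, which is what makes matching at the right offset detectable.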
Start time 14:55
NOT ANOTHER SPECIES RICHNESS ESTIMATOR?!
Beth Norris
University of Kent, UK
Keywords: Statistical ecology, Species richness estimation
One of the oldest and most intuitive measures of biodiversity is species richness, which is simply the number of species present in an area of study. Sampling from populations will rarely give a complete inventory of species and therefore several methods have been developed in order to estimate the true species richness of a population from sample data. There are over 20 different techniques already described that will produce an estimate of total species richness, so why do we need another?
Species richness estimators often perform badly for benthic data sets. Some studies have suggested that species richness estimation is dependent on spatial patterns, and that the clustered spatial distribution of benthic assemblages hampers incidence based estimators such as Chao2 and ICE. None of the commonly used species richness estimators considered take into account spatial heterogeneity, and non-parametric estimators often underestimate the total species richness for such data sets.
Therefore, an alternative approach has been proposed which relies on modelling the underlying spatial pattern of individual species. The modelling framework considered is based on the method of maximum likelihood, and fits a parametric model to observed species abundances. As species heterogeneity factors will be taken into account alongside species abundances, this method should perform well in estimating the true species richness of an area. The method will be assessed by simulation, and will be applied to benthic data sets supplied by Cefas. The estimates will be compared to the results from some established estimators.
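As background on the established estimators used for comparison, the sketch below computes the classic form of the incidence-based Chao2 estimator, S_obs + Q1^2 / (2 * Q2), where Q1 and Q2 are the numbers of species found in exactly one and exactly two sampling units. The function name, the input layout (one set of species labels per sampling unit) and the decision to raise an error when Q2 is zero are assumptions for illustration; bias-corrected variants exist and the new likelihood-based method of the abstract is not shown.

```python
from collections import Counter


def chao2(samples):
    """Classic Chao2 estimate of total species richness from incidence data."""
    # Number of sampling units in which each species was recorded.
    incidence = Counter(sp for unit in samples for sp in set(unit))
    s_obs = len(incidence)                                  # species observed
    q1 = sum(1 for c in incidence.values() if c == 1)       # uniques
    q2 = sum(1 for c in incidence.values() if c == 2)       # duplicates
    if q2 == 0:
        raise ValueError("classic Chao2 is undefined when no species "
                         "occurs in exactly two sampling units")
    return s_obs + q1 ** 2 / (2 * q2)


units = [{"a", "b", "c"}, {"a", "b"}, {"a", "d"}]
estimate = chao2(units)
```

The estimator depends on the data only through the counts of rare species and ignores where the sampling units are located, which is consistent with the abstract's point that spatial clustering is not taken into account.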
Start time 15:20
UNCERTAINTY ANALYSIS FOR MULTIPLE ECOSYSTEM MODELS USING BAYESIAN EMULATORS
Rachel Oxlade, Prof. Michael Goldstein and Dr. Peter Craig
University of Durham, UK
Keywords: Bayesian, Bayes Linear, simulator, ecosystem, model, emulation
Bayesian emulation provides a tool for analysing complex simulators. When there are many parameters over a large input space, and model runs are costly, emulation enables us to approximate the simulator across the space, and gives a measure of our uncertainty at each point.
This talk introduces emulation and then investigates how it can be applied to HadOCC, the Hadley Centre Ocean Carbon Cycle model. The goal of the project is to be able to jointly emulate two simulators of the same system, and this idea will be introduced in the talk.
Start time 15:45
ESTIMATING BIOLOGICALLY PLAUSIBLE RELATIONSHIPS BETWEEN AIR POLLUTION AND HEALTH
Helen Powell, Duncan Lee and Adrian Bowman
Department of Statistics, University of Glasgow, UK
Keywords: Air pollution, Monotonic dose-response relationship, Respiratory health
The effects of air pollution on human health can be estimated using ecological time-series studies, which comprise daily data for the population living within an urban area. The responses are daily counts of mortality or morbidity outcomes, which are related to air pollution concentrations and other covariates. The majority of studies estimate a linear relationship between pollution (x_t) and health, although a number have estimated non-linear dose-response curves g(x_t). However, these curves are typically unconstrained and estimated using smoothing or penalised splines, meaning that biologically implausible results can occur. For example, for some levels of pollution the estimated health effects may decrease for increasing concentrations. Therefore, we propose a method for estimating biologically plausible dose-response curves, which must satisfy the following properties: (i) increasing monotonicity; (ii) smoothness; and (iii) g(0) = 0, which together force the dose-response curve to be non-negative.
We applied this approach to data from Glasgow, using counts of respiratory related hospital admissions and ozone concentrations. We compared our model with one that incorporates an unconstrained curve, and found that the latter produced unrealistic results, as the relative risk fell below one and there was a decreasing risk of hospital admissions at high concentrations of ozone. In contrast, the constrained curve does not give a relative risk below one for any concentration of ozone, and therefore does not imply that ozone could be beneficial to health. This curve was also biologically plausible, because increasing ozone concentrations result in increasing health risks.
12.2 Wednesday 14th April
12.2.1 Session 4a: Medical Statistics III
Session Room: MS.01
Chair: Fiona McElduff
Start time 09:10
AN APPLICATION OF SURVIVAL TREES TO THE STUDY OF CARDIOVASCULAR DISEASE
Alberto Alvarez Iglesias1, John Newell2 and Liam Glynn3
1 School of Mathematics, Statistics and Applied Mathematics, NUI, Galway, Ireland.
2 Clinical Research Facility, NUI, Galway, Ireland.
3 Department of General Practice, NUI, Galway, Ireland.
Keywords: Recursive partitioning, Survival Trees, Random Survival Forest
Recursive partitioning methods are a popular non-parametric alternative to the classical parametric and non-parametric models in regression, classification and survival problems. They have been recognised as a useful modelling tool as they produce a model that is very easy to interpret. The beauty of these methods lies in their simplicity and the relative ease with which the results of the analysis can be explained to a person with a non-statistical background. Single trees are an excellent way to describe the structure of the learning data but their predictive power can be disappointing. In the last decade, many efforts have been made to overcome this problem. The resulting methods are generally known as “ensemble methods” and they use a set of trees, created by bootstrapping the original data, in order to improve predictability. The price to be paid, however, is the absence of a single tree. In this work, a data set of 1586 patients with cardiovascular disease will be analysed. The primary endpoint was a cardiovascular composite endpoint, which included death from a cardiovascular cause or any of the cardiovascular events of myocardial infarction (MI), heart failure, peripheral vascular disease and stroke. Seventeen factors/covariates will be considered for development of a prognostic model and the results of different methods for growing survival trees will be compared.
Start time 09:35
ANALYSIS OF AN OBSERVATIONAL STUDY TO IN COLORECTAL CANCER PATIENTS
Cara Dooley1, John Hinde1 and John Newell2
1 National University of Ireland, Galway
2 Clinical Research Facility, National University of Ireland, Galway
The aim of the study was to compare survival of colorectal cancer patients in the whole population against the survival of patients in a sub-population who also had inflammatory bowel disease (IBD). All individuals who suffered from colorectal cancer were drawn from the entire Irish population using data from January 1994 to December 2005 provided by the National Cancer Registry of Ireland (NCRI).
The control group contained many more observations (n > 20000) than the IBD group (n = 170). Given the number of control patients, there was large diversity in this group. In a conventional designed experiment or trial, patients entering the trial would be chosen to be as similar as possible. Usually patients would be similar in age, health etc. As this was an observational study, there was no design prior to collecting the data.
To compensate for this lack of design, each IBD patient is matched to the “closest” control patient. For each pair of IBD and control patients a distance is calculated, and the two patients which have the smallest distance between them (and so are the most similar) are matched. The distance used in this case is a Mahalanobis distance based on ranks. The matching is carried out using the Optmatch package in R.
Start time 10:00
CAUSAL INFERENCE IN LONGITUDINAL DATA ANALYSIS: A CASE STUDY IN THE EPIDEMIOLOGY OF PSORIATIC ARTHRITIS
Aidan O’Keeffe
University of Cambridge, UK
Keywords: Causality, Multi-state model, Local Dependence and Independence, Psoriatic Arthritis
In any setting where there exists a causal link between two processes or events, the cause must precede its effect. Hence, it seems plausible that a model which aims to uncover a causal relationship should account for the passage of time between cause and effect. Longitudinal data are characterised by repeated measurements being taken over time on units/subjects, and in this longitudinal setting it appears natural to consider causality. Multi-state models offer a way of describing changes in longitudinal data over continuous time and it is through the use of such models, in conjunction with important causal concepts, such as composability, local dependence and local independence and the Bradford Hill criteria, that we shall attempt to infer causality. We use data on the progression to clinical damage in the hand joints of patients suffering from the disease psoriatic arthritis (PsA), under observation at the University of Toronto PsA Clinic, in an effort to demonstrate our approach to causal inference. Specifically, we examine the possibility of a causal link between disease activity and clinical damage at the individual joint level.
Start time 10:25
DESIGN AND ANALYSIS OF DOSE ESCALATION TRIALS
Maria Roopa Thomas
Queen Mary University of London
Keywords: Dose escalation, Cohort effects, Bayesian methods
My research work is motivated by Senn et al. (2007). The Royal Statistical Society established an expert group of its own to look into the details of the statistical issues that might be relevant to the Phase I First-in-Man TeGenero trial; the group’s report was published in the Journal of the Royal Statistical Society, Series A. First-in-Man studies aim to find a dose for further exploration in Phase II trials and to determine the therapeutic effects and side effects. Dose escalation trials involve giving increasing doses to different subjects in distinct cohorts. One of the recommendations of the RSS working party was to consider cohort effects. Cohort effects can be influenced by many factors such as different types of people volunteering at different times, changes in the ambient conditions, the staff running the trial, and the protocols for using subsidiary equipment.
With reference to Senn et al. (2007), four designs for three escalating doses and the placebo are taken into account. Using WinBUGS, the cohort effects are fitted and the designs are compared. The variance of the difference between the doses is also computed using WinBUGS.
The area of interest is Bayesian approaches for the design and analysis of dose escalation trials, which involve prior information concerning parameters of the relationships between dose and the risk of an adverse event as well as the desirable effects of the drug. There is a chance to update after every dosing period using Bayes’ theorem.
In this talk I will discuss some of these issues.
Start time 10:50
MODELLING PARENTAL DECISIONS FOR NEWBORN BLOODSPOT SCREENING
Stuart Nicholls
Lancaster University, Lancaster, UK
Keywords: latent variable, decision-making, screening, model
A national programme of newborn bloodspot screening has been in place in the UK since 1969. Recent advances have expanded the range of conditions for which screening is available, with a concomitant increase in the information made available to parents. There is a lack of research, however, as to how parents make decisions about newborn bloodspot screening. This paper reports the analysis of a postal questionnaire in order to evaluate a proposed model of parental decision-making for newborn bloodspot screening. Structural equation modelling was used to assess the model, which showed a good level of fit on several goodness-of-fit measures as well as a non-significant χ² value. Squared Multiple Correlations indicate that a high degree of variance associated with parental decisional quality is accounted for by its predictors of attitude towards screening and perceived choice, with an increase in perceived choice leading to a perceived improvement in parental decisions. Trust in the staff conducting the screening tests was also significantly related to attitudes towards screening. This analysis suggests that the proposed model is appropriate. The model expands on existing decision-making models, suggesting that decisions are affected by sociological factors such as perception of choice and trust in staff as well as rational cognitive elements, such as risk and benefit analyses. This suggests that existing measures of parental decision-making and/or informed choice may be improved by incorporating these elements.
12.2.2 Session 4b: Point Processes and Spatio-temporal Statistics
Session Room: MS.04
Chair: Chris Fallaize
Start time 09:10
POISSON PROCESS PARAMETER ESTIMATION FROM DATA IN BOUNDED DOMAIN
Patrice Marek
University of West Bohemia, Czech Republic
Keywords: Poisson Process, Bounded domain, Parameter estimation, Exponential distribution, Distance-based methods
When we want to estimate the parameter of a Poisson process that describes a natural phenomenon such as earthquakes, we usually have only one realization of the process to work with: repetition is impossible because these processes are in the hands of nature. Moreover, we are usually limited by time or finance and can therefore use only a few observations.
The approach presented in this paper offers an alternative to the classical distance-based methods presented in the literature. Our approach is based on the estimation of two parameters, the measure of the domain and the parameter of the Poisson process. Using this approach we can avoid censoring, which would be problematic in further research on the spatial Poisson process in a bounded domain.
The work has been supported by the grant of the Ministry of Industry and Trade of the Czech Republic MPO 2A 2TP1/051.
Start time 09:35
A COMPARISON OF BAYESIAN SPACE-TIME MODELS FOR OZONE CONCENTRATION LEVELS
Khandoker Shuvo Bakar
School of Mathematics, University of Southampton, UK
Keywords: Space-time modelling, ozone concentrations, auto-regressive model, dynamic linear model, Bayesian spatial prediction
Recently, there has been a surge of interest in space-time modelling of ozone concentration levels. Well-known time series modelling methods such as the dynamic linear models (DLM) and the auto-regressive (AR) models are being used together with the Bayesian spatial prediction (BSP) methods adapted for dynamic data. As a result, practitioners in this field often face the daunting task of selecting among these methods. This paper presents a study comparing three approaches: the DLM approach of Huerta et al. (2004), the BSP method as described by Le and Zidek (2006), and the AR models proposed by Sahu et al. (2007). Recent theoretical results (Dou et al., 2009) comparing the first two approaches are extended to include the AR models. The results are illustrated with a realistic numerical simulation example using information regarding the location of the ozone monitoring sites and observed ozone concentration levels in the state of New York in 2005-2006 for the months of June and July. The speed of computation, the availability of high-level software packages for implementing the methods, and the practical difficulties of using the methods for large space-time data sets are also investigated.
Start time 10:00
MULTI-LEVEL MODELS FOR ECOLOGICAL RESPONSE APPLICATIONS
Iain Proctor², R.I. Smith¹ and Prof. E.M. Scott²
¹ Centre for Ecology and Hydrology, Edinburgh, UK
² University of Glasgow, UK
Keywords: Spatial processes, Multi-level models
A problem which occurs often in spatial statistics is how to represent spatial change. Multi-level models are used for interpreting nested datasets, where various covariates are available at differing resolution scales. Used widely in epidemiological studies, this framework is applicable for population studies. In this approach, I will model the population trend of carabid communities in upland sites of the United Kingdom. For these locations, environmental variables are measured at the site level; habitat of the surrounding area is defined for each transect, with repeat transect measures at some sites in later years. The setup of these data lends itself naturally to a multi-level model, in which various covariates can be assigned as fixed or random effects. The structure allows one to assign non-Gaussian distributions to the random effects, thereby creating more flexibility in the model.
Start time 10:25
A SPATIO-TEMPORAL MODELLING OF MENINGITIS INCIDENCE IN SUB-SAHARAN AFRICA
Michelle Stanton and Prof. Peter Diggle
School of Health and Medicine, Lancaster University, UK
Keywords: meningococcal meningitis, spatio-temporal, dynamic generalised linear models
An area of sub-Saharan Africa, known as the meningitis belt, is frequently affected by large-scale meningitis epidemics resulting in tens of thousands of cases, and thousands of deaths during epidemic years. The link between the seasonal and spatial patterns of epidemics and the climate has long been recognised, although the mechanisms which cause these patterns are not well understood. The Meningitis Environmental Risk Information Technologies Project (MERIT) is a collaborative project involving the World Health Organization, and members of the environmental, public health and epidemiological communities. One of MERIT’s objectives is to use both routine meningitis surveillance data and information on climatic and environmental conditions to develop a meningitis epidemic decision support tool. This decision support tool could then be used to improve the targeting of preventative and reactive vaccine efforts.
Weekly meningitis incidence data have been obtained from the Ethiopian Ministry of Health for the period October 2000 to July 2008 at district (woreda) level. Data on the climate variables most strongly associated with meningitis incidence have been obtained for Ethiopia over the same time period from the International Research Institute (IRI) at Columbia University, New York. We formulate a spatio-temporal dynamic generalised linear model for incidence and describe how the model can be fitted to spatially aggregated incidence data using remotely sensed images of environmental and meteorological factors as explanatory variables. The aim of this project is to enable short-term forecasting of district-level incidence as part of the development of a country-wide meningitis decision support tool.
Start time 10:50
DENOISING UK HOUSE PRICES
Andrew Smith
University of Bristol, UK
Keywords: Nonparametric regression, Penalised regression, Graphs
The British people are obsessed with house prices. There is considerable interest in the difference in price between different areas and in different years. This talk will attempt to show a smooth national trend in house prices, in both space and time.
We will look at noisy data, provided by Halifax, on UK house prices and discuss it as a particular example of regression on a graph. There are considerable challenges in the data, most notably the lack of covariate values and missing observations, that make existing regression methods fail.
Regression on a graph is a new technique that estimates a denoised version of observations made at the vertices of a graph. It is a type of penalised regression, in which distance from data is penalised at all the vertices, and roughness at all the edges of the graph. These penalty terms present computational challenges, so we will see the result of a new, fast algorithm for regression on a graph.
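As a rough illustration of penalised regression on a graph, the sketch below minimises a data-fidelity term at the vertices plus a roughness term on the edges. The quadratic roughness penalty, the Gauss-Seidel solver, the function name `smooth_on_graph` and the toy path graph are all assumptions made for illustration; the abstract does not state which penalty or algorithm is used in the talk.

```python
def smooth_on_graph(y, edges, lam, n_iter=500):
    """Minimise sum_i (y[i] - f[i])**2 + lam * sum_{(i,j) in edges} (f[i] - f[j])**2.

    Hypothetical helper: quadratic penalties are assumed so that a simple
    Gauss-Seidel sweep solves the problem.
    """
    n = len(y)
    neighbours = [[] for _ in range(n)]
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    f = list(y)
    for _ in range(n_iter):
        for i in range(n):
            # Setting the derivative with respect to f[i] to zero gives this update.
            total = sum(f[j] for j in neighbours[i])
            f[i] = (y[i] + lam * total) / (1 + lam * len(neighbours[i]))
    return f


# Toy example: noisy values observed on the path graph 0-1-2-3-4.
y = [1.0, 3.0, 2.0, 4.0, 3.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
f = smooth_on_graph(y, edges, lam=2.0)
```

Larger values of `lam` give smoother fitted values; with `lam = 0` the data are returned unchanged.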
12.2.3 Session 4c: General
Session Room: MS.05
Chair: Michael Tsagris
Start time 09:10
MIXTURE OF LATENT TRAIT ANALYZERS
Isabella Gollini and Thomas Brendan Murphy
University College Dublin, Dublin 4, Ireland
Keywords: Binary Data Models, Latent Variable Models, Mixture Models, Variational Methods
Latent class analysis and latent trait analysis are two of the most common latent variable models for categorical data. Sometimes these models are not sufficient to summarize the data, especially when the data come from a heterogeneous source, the variables are highly dependent and/or the data dimensionality is large. The mixture of latent trait analyzers model extends latent class analysis and latent trait analysis by assuming a model for the categorical response variables that depends on both a categorical latent class and a continuous latent trait variable. Fitting the mixture of latent trait analyzers model is difficult because the likelihood function involves an integral that cannot be evaluated analytically. We focus on the variational approach that works particularly well when the dimensionality of the data is large.
Start time 09:35
A WAVELET BASED APPROACH TO HPLC DATA ANALYSIS
Jennifer Klapper and Dr. Stuart Barber
Department of Statistics, University of Leeds, UK
Keywords: Wavelets, High Performance Liquid Chromatography, Vaguelette-Wavelet
High Performance Liquid Chromatography (HPLC) is a process by which chemical compounds are separated into their constituent ingredients. The data produced by this type of experiment can be viewed as a time-dependent baseline with intermittent peaks. The locations of these peaks indicate which chemicals are present, and the area underneath each peak indicates the quantity of the relevant chemical. However, there are many issues which confound peak identification and quantification; these include the presence of background noise in the data and baseline drift. These problems, amongst others, mean that a certain amount of preprocessing is needed before any type of quantification can take place. We use wavelet denoising techniques to remove the background noise and eliminate the effects of baseline drift. We subsequently use vaguelette-wavelet methods to estimate the derivatives of the data and thus locate the peaks within the data. Finally, numerical integration is used to calculate the areas under the peaks.
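The final integration step can be sketched as follows, assuming the signal has already been denoised and baseline-corrected and that the peak boundaries are known. The trapezoidal rule, the function name `peak_area` and the toy triangular peak are assumptions for illustration; the abstract does not say which integration rule is used.

```python
def peak_area(times, signal, start, end):
    """Trapezoidal area under `signal` between sample indices start and end (inclusive)."""
    area = 0.0
    for k in range(start, end):
        # Area of the trapezium between consecutive samples k and k + 1.
        area += 0.5 * (signal[k] + signal[k + 1]) * (times[k + 1] - times[k])
    return area


# Toy example: a triangular peak of height 2 and base 4 on a zero baseline.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.0, 1.0, 2.0, 1.0, 0.0]
area = peak_area(times, signal, 0, 4)
```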
Start time 10:00
DELETE-REPLACE IDENTITY FOR A SET OF INDEPENDENT OBSERVATIONS
Sakyajit Bhattacharya¹, Brendan Murphy¹ and John Haslett²
¹ University College Dublin
² Trinity College Dublin
The delete-replace diagnostic method is developed in the context of a general model of independent observations. If a set of observations is deleted then it is shown to be estimated by the remaining observations. The identity is shown to be particularly true in the case of a scalar sufficient statistic.
In a multi-parameter case the delete-replace identity holds conditionally. As an example, the exponential family is explored and delete-replace is shown to be true for a one-parameter exponential family. For a curved exponential family the necessary and sufficient conditions for the delete-replace identity are derived.
The estimate of the set of deleted observations is shown to depend only on the sufficient statistic. More particularly, the estimate comes out to be the maximum likelihood estimator of the parameter.
The delete-replace identity holds only for an independent set of observations. A counterexample is derived for a set of dependent observations where the identity does not hold.
Start time 10:25
MODELLING MAIN CONTRACTOR STATUS FOR THE NEW ORDERS SURVEY
Ria Sanderson and Salah Merad
Office for National Statistics, UK
In the past, the New Orders survey sampled only main contractors; this population could be identified as the main contractor (MC) status was collected from an annual census. Following the transfer of construction statistics to the Office for National Statistics, the MC status is now collected through the Business Register and Employment Survey (BRES). This change means that, for small businesses, the population of MCs can no longer be identified (since only a very small number of these businesses are sampled as part of BRES) and hence all small businesses are eligible for selection by the New Orders survey. One important consideration therefore is non-response, as it is unknown whether the non-response rate will be the same for both MCs and non-MCs. In order to reduce potential non-response bias, we introduce a calibration weight which requires an accurate estimate of the number of MCs in the population. We use data from BRES to build a model, and apply it to every business in the population to give each a predicted probability of being an MC. The small number of businesses in BRES means that we have only been able to construct a model that yields accurate estimates of the number of MCs at high levels of aggregation. However, there could be differential non-response within these levels. Therefore, in the future, we would like to make use of past data to update the predicted probabilities, which should allow for accurate estimates at lower levels of aggregation. In this talk, I will describe briefly the sampling design and the estimation method in the New Orders survey, present some results from the modelling of the MC status, and then discuss data errors in the reporting of the MC status.
Start time 10:50
BAYES LINEAR KINEMATICS IN THE ANALYSIS OF FAILURE RATES
Kevin Wilson
Newcastle University, UK
Keywords: Bayesian inference, Bayes linear kinematics, count data, failure rates
Collections of related Poisson counts arise, for example, from numbers of failures in similar machines or neighbouring time periods. A conventional Bayesian analysis requires a rather indirect prior specification and intensive numerical methods for posterior evaluations.
An alternative approach using Bayes linear kinematics, in which simple conjugate specifications for individual counts are linked through a Bayes linear belief structure, is presented. The use of transformations of the Poisson parameters is proposed. The approach is illustrated using an example involving Poisson counts of failures.
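The conjugate specification for a single count can be sketched as a gamma-Poisson update. This shows only the standard conjugate step for one failure rate; the Bayes linear kinematic linking between counts is not attempted here, and the function name `gamma_poisson_update` and the numbers used are hypothetical.

```python
def gamma_poisson_update(shape, rate, count, exposure):
    """Update a Gamma(shape, rate) prior on a Poisson failure rate.

    With `count` failures observed over `exposure` units of time, the
    posterior is Gamma(shape + count, rate + exposure).
    """
    return shape + count, rate + exposure


# Hypothetical prior Gamma(2, 1) and data: 3 failures in 2 time units.
shape, rate = gamma_poisson_update(2.0, 1.0, count=3, exposure=2.0)
posterior_mean = shape / rate  # posterior mean of the failure rate
```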
12.2.4 Session 4d: Graphical Models and Extreme Value Theory
Session Room: A1.01
Chair: Guy Freeman
Start time 09:10
UNCERTAINTY IN CHOICE OF MEASUREMENT SCALE FOR EXTREME VALUE ANALYSIS
Jenny Wadsworth¹, Jonathan Tawn¹ and Philip Jonathan²
¹ Lancaster University
² Shell Technology Centre Thornton
Keywords: Extreme Value Theory, Measurement Scale, Significant Wave Height
The effect of the choice of measurement scale upon inference and prediction from extreme value models is examined. When measurements of the same process are recorded on different scales linked by a non-linear transformation, separate extreme value analyses carried out on the two scales can lead to highly discrepant conclusions concerning future extremes of the process. For some distributions it turns out that there is in fact an optimal choice of scale to minimise the bias of the model. This talk describes how a Box-Cox transformation can be incorporated into an analysis, providing a parametric methodology to account for scale uncertainty. An example dataset of significant wave height measurements is used to illustrate both the problem and the new methodology.
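For reference, the Box-Cox family that links measurement scales can be written down directly. The sketch below gives the standard one-parameter transformation and its inverse; the function names are hypothetical, and how the transformation parameter is estimated within the extreme value analysis is not specified in the abstract and is not shown.

```python
import math


def box_cox(x, lam):
    """Standard one-parameter Box-Cox transform of a positive value x."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)  # limiting case as lam tends to 0
    return (x ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original measurement scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


# Example: a value of 4.0 on the original scale, square-root-type scale (lam = 0.5).
z = box_cox(4.0, 0.5)
x_back = inverse_box_cox(z, 0.5)
```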
Start time 09:35
MODELLING EXTREMAL PHENOMENA USING DIFFERENT DATA SOURCES
Ben Youngman
University of Sheffield, UK
Keywords: Extreme value theory, Spatial modelling
A common problem in the modelling of extremes of phenomena is sparsity or quality of data. This may be because few extremes have occurred or because extremes are difficult to measure. A consistent source of non-observational data comes from numerical model output, e.g. climate models. Typically these provide data of high spatiotemporal resolution, yet often poorly capture the behaviour of extremes. Here a method is proposed to characterise this inaccuracy. This is done by relating the model output to some proximate observational data, both of which theoretically quantify the same phenomenon.
Start time 10:00
PARAMETRISATION OF GRAPHICAL MODELS
Simon Byrne
Statistical Laboratory, University of Cambridge, UK
Keywords: Graphical models, Bayesian inference, Covariance matrix estimation
Graphical models have recently become popular tools in statistics and related fields. A graphical model is a joint probability distribution which has certain conditional independence properties, known as Markov properties, based on the structure of a graph. This graph provides both an aid to the human comprehension of complex multivariate models and a framework for efficient computation of the marginal and conditional distributions, whether by exact, approximate or sampling-based methods.
This talk will focus on the problem of efficiently parameterising families of such distributions. If the parameters for the conditionally independent components themselves have certain independence properties, so-called “hyper Markov properties”, then the problem of parameter estimation, both in a maximum likelihood and Bayesian framework, can be simplified by local computations. I will provide some examples and applications of these properties.
Start time 10:25
BAYESIAN INFERENCE FOR SOCIAL NETWORK MODELS
Alberto Caimo
University College Dublin, Ireland
Keywords: Exponential random graph models, MCMC methods, Bayesian inference
Exponential random graph models are widely used and studied models for social networks. Despite their popularity, they are extremely difficult to handle from a statistical viewpoint since their normalising constant is available only in very trivial cases. We propose to carry out the estimation using a Bayesian framework via the exchange algorithm of Murray et al. (2006), which circumvents the need to calculate the normalising constants of the posterior density. Moreover we propose to further improve mixing and local moves on the posterior support using a population MCMC approach with snooker update. This method improves performance with respect to the widely used Monte Carlo maximum likelihood estimation whose convergence is often troublesome.
12.2.5 Session 5a: Experimental Design and Population Genetics
Session Room: MS.01
Chair: Andrew Simpkin
Start time 11:30
CANONICAL ANALYSIS OF MULTI-STRATUM RESPONSE SURFACE DESIGNS & STANDARD ERRORS OF EIGENVALUES
Mudakkar M. Khadim
School of Mathematical Sciences, Queen Mary University of London, UK
Keywords: Response surface methods, Canonical analysis, Eigenvalues, Multi-stratum Design
Bisgaard and Ankenman described the double linear regression method to obtain the standard errors for the eigenvalues in second-order response surface models, but they discussed this method only for a completely randomized error control structure. However, in many industrial experiments, the experimenter might not be able to perform complete randomization and hence might be forced to use multi-stratum error control structures, of which the split-plot design is a special case. We have tried to apply the same double linear regression model to multi-stratum error control structures to get the standard errors for the eigenvalues in second-order response surface models.
Start time 11:55
D-OPTIMAL DESIGN OF EXPERIMENTS FOR A DYNAMIC MODEL WITH CORRELATED OBSERVATIONS
Kieran Martin, Stefanie Biedermann, Susan Lewis, David Woods and EPSRC CASE project supported by GlaxoSmithKline
University of Southampton, UK
Keywords: experimental design, dynamic models
Models derived from differential equations occur frequently in the pharmaceutical industry. Optimal designs for these models are required to gather information for model fitting. Finding such designs can be problematic: the models will usually be non-linear, making the optimal choice of design parameter-dependent, and the observations may be correlated. We aim to find designs which will give accurate estimates of the model parameters while remaining robust to the effects of correlation and parameter uncertainty. We find pseudo-Bayesian D-optimal designs to meet these objectives, then use a simulation study to assess their robustness by calculating the mean square error for each design. We demonstrate that the designs found still perform well when the domains of the prior parameter distributions are mis-specified.
Start time 12:20
VULNERABILITY: A 2ND CRITERION TO DISTINGUISH BETWEEN EQUALLY-OPTIMAL BIBDS
Helen Thornewell
Maths Department, University of Surrey, Guildford, UK
Keywords: Balanced Incomplete Block Designs (BIBDs), Disconnectedness, Observation Loss, Optimality, Robustness, Selection, Vulnerability
If a Balanced Incomplete Block Design (BIBD) exists for the parameters, it is known that these designs are universally optimal. However, if there exists more than one BIBD with the same parameters, is one design better than the other? Is optimality the only criterion that needs to be tested at design selection? Are there ways of distinguishing between non-isomorphic, equally-optimal BIBDs?
Many experiments suffer from observation loss during the course of the experiment. This may result in a disconnected eventual design so that not all pairwise treatment comparisons can be estimated and the null hypothesis cannot be tested. In order to guard against poor eventual designs, I have introduced a Vulnerability Measure to determine how likely a design is to become disconnected. The formulae depend on the design concurrences. Are some BIBDs more vulnerable than others?
My new robustness criterion is compared to other criteria from the literature. For example, Prescott & Mansson (2001) consider the robustness of designs against the loss of any two single observations, which depends on the block intersection sizes. Are there combinatorial links between block intersections and concurrences? Is the least vulnerable BIBD for disconnectedness also the most robust BIBD against the loss of single observations? Does one criterion provide more information than the other for comparison, selection and construction of BIBDs?
General theorems, formulae and results will be presented and interactive examples using sets of complement BIBDs will be demonstrated in order to answer these questions and more...
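Since the vulnerability formulae are said to depend on the design concurrences, a small sketch of how concurrences can be counted may be useful. The function name `concurrences` and the choice of the (7, 3, 1) design (the Fano plane) are assumptions for illustration; the Vulnerability Measure itself is not given in the abstract and is not computed here.

```python
from collections import Counter
from itertools import combinations


def concurrences(blocks):
    """Count, for each pair of treatments, the number of blocks containing both."""
    counts = Counter()
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return counts


# The (7, 3, 1) BIBD: 7 treatments in 7 blocks of size 3.
fano = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
        (1, 4, 6), (2, 3, 6), (2, 4, 5)]
conc = concurrences(fano)

# In a BIBD every one of the v(v - 1)/2 treatment pairs has the same concurrence.
is_balanced = len(conc) == 7 * 6 // 2 and len(set(conc.values())) == 1
```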
Start time 12:45
SURFING IN ONE DIMENSION
Emma Kershaw
University of Bristol, Statistics Group
Keywords: Coalescent, Population Genetics, Stochastic Processes, Population Expansion
Geographical expansions of a population have occurred throughout history, with humans believed to have expanded out of Africa in the last 100,000 years. They are of particular interest in the field of evolutionary biology as they can have a drastic effect on the distribution and diversity of genes in the newly colonized area. Such genetic phenomena have been used as markers to indicate possible range expansions in the past.
This talk considers the phenomenon of genes surfing on the wave front of an expanding population in one dimension and we introduce some classical statistical population genetics models. Two simulations are introduced which explore the problem further. A forward-in-time model using classical population genetics theory enables an exact ancestral graph of individuals at the wave front to be constructed and used as a means of comparison for the second model, an approximate backward-in-time simulation which attempts to estimate this ancestral distribution using methods of coalescent theory.
Start time 13:10
DIMENSION REDUCTION FOR HUMAN GENOMIC SNP VARIATION
Colette Mair and Dr. Vincent Macaulay
University of Glasgow, UK
Keywords: population structure, Wright’s island model
We will discuss ways of detecting population structure from genetic data from a set of individuals, each belonging to one population from a genetic collection of populations. The main question of interest is whether the set of individuals belongs to a larger homogeneous population or whether the population can be segregated into subpopulations that are genetically distinct. This is important since a great deal of genetic analysis assumes independence of individual genotypes, which may be violated through population structure. As a result, not correcting for population structure can lead to misleading results. Further, discovering population structure can help us understand the demographic history of the populations of interest.
One of the many issues with such studies is dealing with the large quantity of data. Over the last decade or so, SNP data have become widely available in vast quantities. This is the type of data we will consider throughout. A single nucleotide polymorphism, or SNP, is a position in the DNA sequence which is known to be variable in the populations of interest. Since we will be dealing with a large number of variables (SNPs), we will consider principal components analysis. This was first introduced to the study of genetic data over 30 years ago and is a common statistical tool for reducing the dimension of data to relatively few components while still accounting for a substantial part of the variation. Each component will capture a proportion of the population structure present in the data, if any. Established software can be used which, given such SNP data and using principal component analysis, can determine if population structure is present in the data. By observing a biplot from a real data set and also using simulated data, correlations with geographical locations will be considered. Such correlations have been observed recently, for example, in Europe.
We will firstly consider SNP data from the Human Genome Diversity Panel, consisting of roughly 1050 individuals from 50 countries all genotyped at around 650,000 SNPs. From there, we will briefly consider simulated data under Wright’s island model. With this model, simulation of SNPs from a number of populations is possible with the amount of migration between populations controlled. This simplified model will help illustrate the ideas presented but is only one of many possible models. However, it is useful in demonstrating population structure and correlations between geographical and genetic distance.
12.2.6 Session 5b: Censoring in Survival Data and Non-Parametric Statistics
Session Room: MS.04
Chair: Jennifer Rogers
Start time 11:30
PARAMETRIC SURVIVAL MODEL WITH TIME-DEPENDENT COVARIATES FOR RIGHT CENSORED DATA
Hisham Abdel Hamid Elsayed
Statistics Group, School of Mathematics, University of Southampton, UK
Keywords: Parametric models, Right censoring, Splines, Time-dependent covariates
One standard approach in survival analysis is to use the Cox proportional hazards regression model. This can easily be extended to incorporate one or more covariates whose values are subject to change over time. An alternative and potentially more efficient approach is to use simple parametric accelerated failure time models with standard survival distributions such as the Weibull, log-logistic and log-normal. Again these models may be extended to incorporate time-dependent covariates. However, in some areas of medical statistics simple parametric models often fit poorly. In this paper the standard Weibull regression model is extended to incorporate time-dependent covariates and made more flexible by using splines. The competing methods are implemented and compared using two large data sets (supplied by NHS Blood and Transplant) of survival times of corneal grafts and heart transplant patients.
Start time 11:55
ASSESSING THE EFFECT OF INFORMATIVE CENSORING IN PIECEWISE PARAMETRIC SURVIVAL MODELS
Natalie Staplin
University of Southampton
Keywords: Survival analysis, Informative censoring, Sensitivity analysis, Parametric models, Piecewise exponential
Many of the standard techniques used to analyse censored survival data assume that there is independence between the failure time and censoring processes. There are situations where this assumption can be questioned, especially when looking at medical data. It would be useful to know whether we can assume independence or whether we need a model that takes account of any dependence. The method presented here assesses the sensitivity of the parameter estimates in parametric models to small changes in the amount of dependence between failure time and censoring.
Parametric models with piecewise hazard functions are considered to allow a greater amount of flexibility in the models that may be fitted. In particular, piecewise constant hazard functions are considered, which means the piecewise exponential model is being used. This method is applied to a dataset that follows patients registered on the waiting list for a liver transplant. It suggests that in some cases even a small change in the amount of dependence can have a large effect on the results obtained.
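As a baseline for the piecewise exponential model, the sketch below computes the usual maximum likelihood estimate of a piecewise constant hazard, namely events divided by time at risk in each interval. It assumes independent censoring, which is exactly the assumption the abstract goes on to question, and does not implement the sensitivity analysis itself; the function name `piecewise_exponential_mle` and the toy data are hypothetical.

```python
def piecewise_exponential_mle(times, events, cuts):
    """Estimate a piecewise constant hazard under independent right censoring.

    times  : observed follow-up times
    events : 1 if the failure was observed, 0 if the time is censored
    cuts   : increasing interior cut points, giving the intervals
             [0, c1), [c1, c2), ..., [ck, infinity)
    """
    bounds = [0.0] + list(cuts) + [float("inf")]
    hazards = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        # Total time at risk contributed to this interval by every subject.
        exposure = sum(max(0.0, min(t, hi) - lo) for t in times)
        # Number of observed failures falling in this interval.
        deaths = sum(1 for t, d in zip(times, events) if d and lo <= t < hi)
        hazards.append(deaths / exposure if exposure > 0 else float("nan"))
    return hazards


# Toy data: four subjects, the second of whom is censored at time 2.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 0, 1, 1]
hazards = piecewise_exponential_mle(times, events, cuts=[2.0])
```

In this toy example the first interval has one failure over 7 units of exposure and the second has two failures over 3 units.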
Start time 12:20
DEALING WITH CENSORING IN QUALITY ADJUSTED SURVIVAL ANALYSIS AND COST EFFECTIVENESS ANALYSIS
Howard Thom
Biostatistics Unit, University of Cambridge, UK
Keywords: Cost Effectiveness Analysis, Health Economics, Censoring, Inverse Probability Weighting, Bootstrapping
Estimation of average costs and quality adjusted life years is often complicated by heavy censoring in the data, as this censoring is implicitly informative. Simple empirical means are biased, and standard survival analysis methods are inappropriate. For the purposes of cost-effectiveness analysis, it is necessary to obtain unbiased estimates of the means and variances of our quantities. This issue will be illustrated with a contemporary example comparing the cost-effectiveness of four functional diagnostic tests in the diagnosis and management of coronary artery disease. The method of inverse-weighting will be applied to this example, and an analytic form for variance estimates, derived by Willan et al., will be discussed in comparison with a simple bootstrap method.
Start time 12:45
NONPARAMETRIC PREDICTIVE INFERENCE FOR SYSTEM
RELIABILITY
Ahmad M Aboalkhair
Durham University, UK
Keywords: k-out-of-m systems, lower and upper probabilities, nonparametric predictive inference, redundancy allocation, series-parallel systems, system reliability
Recently, the application of a novel statistical method called nonparametric predictive inference (NPI) to problems of system reliability has been presented. In NPI, relatively weak statistical modelling assumptions are made, which is made possible by the use of lower and upper probabilities to quantify uncertainty, leading to inferences which are strongly based on observed data and which explicitly consider future observable events. Throughout this work, attention is on lower and upper probabilities for system functioning, given binary test results on components; as such, it takes uncertainty about component functioning and indeterminacy due to limited test information explicitly into account. Lower and upper probabilities, also known as imprecise probability, have several advantages over classical (precise) probability in a reliability context. Coolen-Schrijner et al. (2008) considered systems that are series configurations of subsystems, with each subsystem a voting system ('k-out-of-m' system) which consists of only one type of component, and different subsystems consisting of components of different types. They presented a powerful optimal algorithm for redundancy allocation for such systems, for the situation where components of all types have been tested with zero failures found in the tests. MacPhee et al. (2009) generalized this to general test results. We present the basic results of NPI for system reliability, followed by a detailed presentation of optimal redundancy allocation following general component test results, and outline related research challenges.
Start time 13:10
NONPARAMETRIC ESTIMATION OF RELIABILITY OF TWO
RANDOM VARIABLES USING KERNEL ESTIMATION OF
DENSITY
Tomas Toupal
University of West Bohemia, Czech Republic
Keywords: Bivariate distribution, Nonparametric estimation, Reliability, Kernel estimation, Density and distribution function
This talk discusses the problem of reliability estimation, particularly for the bivariate distribution. In real situations it may be used in many applications, especially in engineering (such as structures, static fatigue, the ageing of concrete pressure vessels), medicine, quality control, military service or the balance of payments.
The parametric estimation of a density and distribution function of reliability following a specified distribution has been discussed extensively in the literature. Hence, in this talk I will present the kernel estimation of the density and distribution function using several types of kernels.
In the final part I will use the results of the previous estimation to demonstrate how to obtain the reliability from the kernel estimate, and apply it to experimental data on the balance of payments of the Czech Republic. In this case, the reliability is represented by the fact that the total amount of expenditure is not higher than the total income.
This work has been supported by a grant of the Ministry of Industry and Trade of the Czech Republic, MPO 2A 2TP1/051.
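To make the idea of a kernel-based reliability estimate concrete, here is a minimal sketch under simplifying assumptions that are not taken from the abstract: the data are paired, the bivariate problem is reduced to the univariate difference D = income - expenditure, a Gaussian kernel is used, and the bandwidth follows a rule of thumb. The function name and the example numbers are hypothetical.

```python
from statistics import NormalDist, stdev


def kernel_reliability(income, expenditure, bandwidth=None):
    """Smoothed estimate of P(expenditure <= income) from paired data."""
    diffs = [x - y for x, y in zip(income, expenditure)]
    n = len(diffs)
    if bandwidth is None:
        # Rule-of-thumb bandwidth; an assumption, not taken from the abstract.
        bandwidth = 1.06 * stdev(diffs) * n ** (-1 / 5)
    phi = NormalDist().cdf
    # Gaussian-kernel CDF estimate: F_hat(x) = mean(Phi((x - d_i) / h)),
    # hence P(D >= 0) = 1 - F_hat(0) = mean(Phi(d_i / h)).
    return sum(phi(d / bandwidth) for d in diffs) / n


# Made-up yearly totals, for illustration only.
income = [10.0, 11.0, 12.0, 12.5]
expenditure = [9.0, 11.5, 10.0, 12.0]
print(kernel_reliability(income, expenditure))
```

Other kernels would replace the standard normal distribution function by the integrated kernel of choice; the structure of the estimate stays the same.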
12.2.7 Session 5c: Time Series and Diffusions
Session Room: MS.05
Chair: Alexander Strawbridge
Start time 11:30
SEQUENTIAL INTEGRATED NESTED LAPLACE
APPROXIMATION
Arnab Bhattacharya and Simon Wilson
Trinity College Dublin, Ireland
Keywords: Bayesian inference, Sequential methods
This work addresses the problem of sequential inference for time series in real time, which will later be extended to deal with spatio-temporal models. The idea is to develop a fast functional approximation scheme so as to perform real-time data analysis of unknown quantities, given observations that depend on some underlying latent variable.
The problem is defined as follows: the observed variables $Y_t$, $t \in \mathbb{N}$, $Y_t \in \mathcal{Y}$, are assumed to be conditionally independent given the latent process $X_t$ (assumed to be a GMRF) and the unknown hyperparameters $\Theta$, and can have any distribution. The primary aim is to estimate the posterior distribution $P(x_{0:t} \mid y_{1:t}, \theta)$ and also the filtering density $P(x_t \mid y_{1:t}, \theta)$. The computation of these two terms necessarily requires the estimation of the posterior density of $\Theta$. We are interested in providing sequential solutions for both $P(\theta \mid y_{1:t})$ and $P(x_t \mid y_{1:t}, \theta)$. The new method is motivated by a recently published technique known as Integrated Nested Laplace Approximation (INLA), developed by Rue et al. (2009). The procedure has already been implemented on linear Gaussian state-space models with unknown state of the system and covariance parameters and has proved to be very accurate and fast. We consider implementing it in the generalized case where there is nonlinearity and non-Gaussianity.
Start time 11:55
FINDING CHANGEPOINTS IN A GULF OF MEXICO
HURRICANE HINDCAST DATASET
Rebecca Killick¹, Idris Eckley¹, Kevin Ewans² and Philip Jonathan³
¹ Maths & Stats, Lancaster University
² Shell International Exploration & Production, Netherlands
³ Shell Technology Centre Thornton, Chester
Keywords: Changepoints, Likelihood, Schwarz Information Criterion, Bayesian Information Criterion, GOMOS
Statistical changepoint analysis is used to detect changes in variability within GOMOS hindcast time-series for significant wave heights of storm peak events across the Gulf of Mexico for the period 1900-2005. To detect a change in variance, the two-step procedure consists of (1) validating model assumptions per geographic location, followed by (2) application of a penalised likelihood changepoint algorithm. Results suggest that the most important changes in time-series variance occur in 1916 and 1933 at small clusters of boundary locations at which, in general, the variance reduces. No post-war changepoints are detected. The changepoint procedure is readily applied to other environmental time-series.
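The penalised likelihood step can be illustrated with a minimal sketch for a single change in variance. It assumes independent zero-mean Gaussian observations and a Schwarz-type penalty of log(n); the function name and the simulated series are hypothetical, and this is not the authors' algorithm, which may handle multiple changepoints and a different penalty.

```python
import numpy as np


def single_variance_changepoint(x):
    """Most likely single change in variance for zero-mean Gaussian data.

    Returns tau, the first index of the second segment, or None when the
    penalised likelihood prefers the no-change model.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    sq = np.cumsum(x ** 2)

    def cost(sum_sq, m):
        # -2 * maximised Gaussian log-likelihood of m zero-mean points,
        # dropping constants shared by both models.
        return m * np.log(sum_sq / m)

    null_cost = cost(sq[-1], n)
    best_tau, best_cost = None, np.inf
    for tau in range(2, n - 1):  # keep at least two points per segment
        c = cost(sq[tau - 1], tau) + cost(sq[-1] - sq[tau - 1], n - tau)
        if c < best_cost:
            best_tau, best_cost = tau, c
    penalty = np.log(n)  # assumed Schwarz-type penalty for the extra variance
    return best_tau if null_cost - best_cost > penalty else None


# Simulated series whose standard deviation changes from 1 to 3 at index 200.
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 1, 200), rng.normal(0, 3, 200)])
print(single_variance_changepoint(series))
```

In practice the series would first be checked against the model assumptions (step 1 of the procedure above), for example by removing any trend in the mean.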
Start time 12:20
PREDICTION INTERVALS OF THE LOCAL SPECTRUM
ESTIMATE
Kara Stevens
University of Bristol, UK
Keywords: Time series, locally stationary, Bayesian wavelet shrinkage, localized autocovariance, local spectrum prediction intervals
Time series data occur in many disciplines such as finance and medicine. Often there is a dependence structure between time series observations. The typical indicator of this dependence is the covariance function. If a time series is second order stationary then the mean and variance are constant, and the covariance only depends on the time difference between observations. However many time series are not stationary. One class of non-stationary time series are locally stationary time series that possess slowly evolving second order quantities, such as variance. In these cases models that assume stationarity are inappropriate and alternative methods should be used.
An interesting class are locally stationary wavelet models, which can be used to define a localized autocovariance, calculated from an evolutionary wavelet spectrum. This is similar to the spectrum used to analyse stationary time series in the frequency domain, but it is expressed within the wavelet domain and changes through time. The evolutionary wavelet spectrum is estimated from data through the wavelet periodogram. This quantity is asymptotically unbiased but not consistent.
We have developed an empirical Bayesian wavelet shrinkage method to smooth the wavelet periodogram and thus improve our estimation of the evolutionary wavelet spectrum. Our method has the advantage of producing prediction intervals and probabilities associated with the evolutionary wavelet estimate. The new methodology will be compared with current techniques.
Start time 12:45
DISCRETE- AND CONTINUOUS-TIME APPROACHES TO
IMPORTANCE SAMPLING ON DIFFUSIONS
David Suda
University of Lancaster, UK
Keywords: stochastic calculus, Bayesian inference, computational statistics
In this talk we shall tackle the problem of importance sampling methods for diffusions. We first approximate an Ito diffusion by a discrete-time Markov chain using the Euler discretization, and then implement importance sampling methods appropriate for discrete-time Markov chains. This setting is simpler to conceive as it only requires the understanding of the Radon-Nikodym derivative for finite-dimensional distributions. We then look at the problem within a continuous-time context. In this case, one requires the understanding of the Radon-Nikodym derivative with respect to probability measures which are infinite-dimensional. In actual practice, continuous-time importance sampling is never implemented exactly. However it will be useful in constructing new proposal densities, and it can also prove useful in analyzing the asymptotic behaviour of importance sampling weights. Some empirical results based on a simulation study of the above shall also be presented.
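The discrete-time approach can be sketched as follows, assuming a scalar diffusion with constant diffusion coefficient `sigma` and a proposal that differs from the target only in its drift. Paths are simulated with the Euler scheme under the proposal and reweighted by the ratio of Gaussian transition densities, which is the finite-dimensional Radon-Nikodym derivative. The function name, arguments and the example drifts are hypothetical and are not taken from the talk.

```python
import numpy as np


def euler_importance_sampling(f, target_drift, proposal_drift, sigma, x0, T,
                              n_steps, n_paths, seed=0):
    """Estimate E[f(X_T)] for dX = target_drift(X) dt + sigma dW by simulating
    Euler paths under proposal_drift and reweighting them."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, float(x0))
    log_w = np.zeros(n_paths)
    for _ in range(n_steps):
        a = target_drift(x)
        b = proposal_drift(x)
        step = b * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
        # Log ratio of the Gaussian Euler transition densities (target / proposal).
        log_w += ((step - b * dt) ** 2 - (step - a * dt) ** 2) / (2 * sigma ** 2 * dt)
        x = x + step
    return np.mean(np.exp(log_w) * f(x))


# Rare event example: P(W_1 > 2) for standard Brownian motion, using a
# proposal with constant drift 2 that pushes paths towards the event.
estimate = euler_importance_sampling(
    f=lambda x: (x > 2.0).astype(float),
    target_drift=lambda x: 0.0 * x,
    proposal_drift=lambda x: 0.0 * x + 2.0,
    sigma=1.0, x0=0.0, T=1.0, n_steps=50, n_paths=20000)
print(estimate)
```

For constant drifts the Euler scheme is exact, so the example can be checked against the normal tail probability; for state-dependent drifts the estimate also carries discretization error.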
Start time 13:10
BAYESIAN INFERENCE FOR DIFFUSIONS BASED ON EXACT
SIMULATION
Isadora Antoniano-Villalobos and Prof. Stephen Walker
University of Kent, UK
Keywords: Univariate diffusions, Exact Simulation, Bayesian non-parametric, Consistency
When a certain phenomenon is modelled by means of a real-valued diffusion process, the model is often stated in terms of a stochastic differential equation (SDE). Statistical inference in this context is then aimed at the estimation of parameters appearing in the drift and diffusion coefficients of the SDE. When exact simulation via MCMC is used for Bayesian estimation, the algorithm introduces latent variables which transform the model into a Bayesian non-parametric model.
In this framework, we propose a way of using the exact simulation algorithm for Bayesian estimation of the parameters of a specific family of SDEs. We then study the consistency of the resulting posterior densities of the parameters involved when the number of data points of a single diffusion path grows within a fixed time interval.
12.2.8 Session 5d: Probability
Session Room: A1.01
Chair: Duy Pham
Start time 11:30
A NEW BIVARIATE GENERALIZED PARETO MODEL
Antonio A. Ortiz Barranon and Stephen Walker
University of Kent, UK
Keywords: Extreme Value Theory, Generalized Pareto Distribution
Recently, Extreme Value Theory (EVT) has become a well developed area of research. However, some open problems in the multivariate case remain, since the types of distributions present more complications, principally in the dependence structure. So far, the bivariate case has been the main focus of multivariate EVT. One of the concepts that underpin this theory is tail dependence, which is a measure of the dependence between two variables given that one of them is extreme. Most of the approaches found in the literature deal with the problem via the use of copulas.
In the present project, we present a model not based on copulas. We model the data with a simple parametric model that leads to easier computation of the tail dependence and that does not involve the difficulties that copula models have shown.
Start time 11:55
BACKWARD INDUCTION AND SUBTREE PERFECTNESS
Nathan Huntley and Matthias C. M. Troffaes
Durham University, UK
Keywords: Sequential Decision Making, Backward Induction, Separability, Preference Ordering, Independence Principle, Normal Form Solutions
When studying solutions to sequential decision problems, an important property is subtree perfectness (also called separability and consistency). This states that, roughly, for any subtree of the decision tree, the solution of the subtree equals the subtree of the solution. Commonly, solutions lacking subtree perfectness have the following behaviour: the subject initially wants to choose X if he were to reach node N, but upon reaching N wants to choose Y. This is a significant conflict.
Subtree perfectness is, however, a very restrictive property, requiring adherence to a preference ordering and the independence principle. We have found that a weaker form of subtree perfectness, admitting many more possible uncertainty and preference models, can be introduced. This essentially involves relaxing the ordering requirement while maintaining the independence principle. In this talk I will explain why this weakening may be acceptable, and make links with backward induction.
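For readers unfamiliar with backward induction, the sketch below solves a small decision tree in the classical setting of maximum expected utility, where a complete preference ordering holds and the solution of each subtree is the subtree of the solution. The tuple encoding of nodes and the example tree are hypothetical; the weaker setting discussed in the abstract, which may return sets of options, is not covered by this sketch.

```python
def backward_induction(node):
    """Return (value, plan) for a decision tree under expected utility.

    Hypothetical node encoding:
      ("leaf", utility)
      ("chance", [(probability, child), ...])
      ("decision", {action_label: child, ...})
    """
    kind = node[0]
    if kind == "leaf":
        return node[1], None
    if kind == "chance":
        # Solve each child first, then average the values.
        results = [(p, backward_induction(child)) for p, child in node[1]]
        value = sum(p * v for p, (v, _) in results)
        return value, [plan for _, (_, plan) in results]
    if kind == "decision":
        solved = {a: backward_induction(child) for a, child in node[1].items()}
        best = max(solved, key=lambda a: solved[a][0])
        return solved[best][0], (best, solved[best][1])
    raise ValueError(f"unknown node type: {kind}")


tree = ("decision", {
    "safe": ("leaf", 1.0),
    "risky": ("chance", [(0.5, ("leaf", 3.0)), (0.5, ("leaf", 0.0))]),
})
value, plan = backward_induction(tree)
print(value, plan[0])
```

Because every subtree is solved before its parent, calling the function on any subtree returns the same plan that the full solution prescribes there.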
Start time 12:20
ON THE CONVERGENCE OF CONTINUOUSLY MONITORED
BARRIER OPTIONS UNDER MARKOV PROCESSES
Rui Xin Lee and Dr. Vassili Kolokoltsov
University of Warwick, UK
Keywords: Barrier options, Markov chains, Feller process, exit probabilities for continuous time Markov chains, infinitesimal generator
We consider a general barrier option for which the expected discounted random cash flow is modelled as
$$g(S_T)\, I_{\{\tau_A > T\}} + h(S_{\tau_A})\, I_{\{\tau_A \le T\}},$$
where $(S_t)_{t \ge 0}$ is a random price process, $I_C$ denotes the indicator of the set $C$, $\tau_A = \inf\{t \ge 0 : S_t \in A\}$, $g$ denotes a non-negative payoff, $h$ denotes the rebate function and $A$ denotes the knock-out range.
Given barrier option prices under a Feller price process $(S_t)_{t \ge 0}$ with corresponding generator $L$, Mijatovic and Pistorius (2009) present a novel approximation algorithm that constructs a finite-state continuous-time Markov chain $(X^{(n)})$ such that its generator is close to $L$, its law is close to that of $(S_t)_{t \ge 0}$ and its expected payoffs approximate those under $S = \{S_t\}_{t \ge 0}$.
We build on the work of Mijatovic and Pistorius (2009). We study the convergence of such sequences of finite-state continuous-time Markov chains to $S = \{S_t\}_{t \ge 0}$ and establish the rates of convergence.
Start time 12:45
DISTORTION OF PROBABILITY MODELS
Eva Wagnerova
University of West Bohemia in Pilsen, Czech Republic
Keywords: distortion functions, choice of a model, correction
The choice of a suitable model and its description with a probability distribution is the beginning of every statistical inference. However, the data do not always follow typical (textbook) probability distributions. One possible solution to that problem is to use a distortion function to correct the model.
The distortion function is a non-decreasing mapping of the interval [0, 1] into itself. It is a tool to transform distribution functions. This means it can also be used at the very beginning of the modelling. Some useful modifications of goodness-of-fit tests can be constructed through distortions.
In our presentation, we demonstrate some well-known distortion functions and their usage. We also show examples of choosing a suitable distortion function depending on the choice of the model.
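A minimal sketch of how a distortion function transforms a distribution function is given below. It assumes a power distortion g(u) = u^gamma and an exponential baseline model; both are illustrative choices, not necessarily ones used in the presentation, and all function names are hypothetical.

```python
import math


def power_distortion(u, gamma):
    """g(u) = u**gamma, a non-decreasing map of [0, 1] into itself for gamma > 0."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    return u ** gamma


def exponential_cdf(x, rate=1.0):
    """Baseline distribution function F(x) of an exponential model."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def distorted_cdf(x, gamma, rate=1.0):
    """Distorted distribution function G(x) = g(F(x))."""
    return power_distortion(exponential_cdf(x, rate), gamma)


for x in (0.5, 1.0, 2.0):
    print(x, exponential_cdf(x), distorted_cdf(x, gamma=2.0))
```

Since g is non-decreasing with g(0) = 0 and g(1) = 1 for this choice, G is again a distribution function; gamma = 1 leaves the baseline model unchanged.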
12.2.9 Session 6a: Sponsors’ Talks
Session Room: MS.01
Chair: Jennifer Rogers
Start time 14:30
THE INTERNATIONAL BIOMETRIC SOCIETY: WHAT CAN
IT OFFER TO POSTGRADUATE STUDENTS?
Richard Emsley
International Biometric Society
This talk will introduce the International Biometric Society, which promotes the development and application of statistical and mathematical theory and methods in the biosciences. We discuss how the Society was founded by eminent statisticians of the day, and how it has now evolved into a truly international society. We focus on the opportunities available to postgraduate students within the International Biometric Society, including the FREE student membership, the activities of the British and Irish Region, and details of the 2010 International Biometric Conference taking place in Brazil in December this year.
Start time 15:05
BAYESIAN DESIGN & ANALYSIS OF EXPERIMENTS
Phil Woodward
Pfizer
Bayesian approaches are becoming widely used in the Pharmaceutical Industry, particularly in the earlier stages of drug discovery and development. This talk will present current uses of these methods at Pfizer. It will show how the objectives of the studies are quantified using the Bayesian probability concept, and how prior knowledge concerning the efficacy of the compounds being tested is formally used to assess the operating characteristics of the study design. It will also illustrate how more efficient studies have been designed by incorporating the formal use of such prior knowledge in the analysis.
Start time 15:40
AN INTRODUCTION TO FOOTBALL MODELLING AT
SMARTODDS
Robert Mastrodomenico
SmartOdds
Sports modelling presents modern statistics with many interesting and complex problems. As well as the challenge of building models with high predictive utility, there is also a computational challenge associated with calibrating the models given the vast data sets now available across a wide range of sports. This talk describes some of the work we do at Smartodds by providing an introduction to football modelling, and all the associated problems and challenges. We begin by introducing some of the earlier published work in this area, in particular focusing on using generalised linear models to model the goals scored by each team in a football match. We discuss a range of modelling challenges that typically arise in the field of sports modelling, such as how to take account of home field advantage, how to allow for the different strengths of teams, and how to describe the variable nature of team strengths over time. Following this we discuss what it is like to work for Smartodds, and mention some other sports which we are actively researching.
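A minimal sketch of the kind of goals model described in the early literature is shown below: each team's goals are treated as independent Poisson counts whose log rates combine an attack strength, the opponent's defence strength and a home advantage term. The parameter names and values are hypothetical and would normally be estimated by fitting a Poisson generalised linear model to match data; this is not a description of the Smartodds models.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals under a Poisson(rate) model."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def match_probabilities(attack_home, defence_home, attack_away, defence_away,
                        home_advantage, max_goals=15):
    """Home win / draw / away win probabilities under independent Poisson goals
    with log-linear scoring rates (all parameter values are hypothetical)."""
    rate_home = math.exp(home_advantage + attack_home - defence_away)
    rate_away = math.exp(attack_away - defence_home)
    home = draw = away = 0.0
    for i in range(max_goals + 1):        # goals scored by the home team
        for j in range(max_goals + 1):    # goals scored by the away team
            p = poisson_pmf(i, rate_home) * poisson_pmf(j, rate_away)
            if i > j:
                home += p
            elif i == j:
                draw += p
            else:
                away += p
    return home, draw, away


# Two equally strong teams with a made-up home advantage of 0.3 on the log scale.
print(match_probabilities(0.2, 0.1, 0.2, 0.1, home_advantage=0.3))
```

Time-varying team strengths, one of the challenges mentioned above, would replace the fixed attack and defence parameters by quantities that evolve from match to match.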
12.2.10 Session 6b: Sponsors’ Talks
Session Room: MS.04
Chair: Mouna Akacha
Start time 14:30
MAKING DECISIONS WITH CONFIDENCE - STATISTICS
THE SHELL WAY
Wayne Jones
Shell
Shell’s Statistics and Chemometrics group provides research and consultancy services in data analysis and visualisation, statistical modelling, experimental design and statistical software tool development to many Shell businesses in the fields of commerce, finance, process development and product development. The group, based at Amsterdam, Chester and Houston, serves clients world-wide.
Start time 15:05
AN INTRODUCTION TO AHL
Martin Layton
AHL, Man Group PLC
In this presentation I will talk about AHL, a quantitative hedge fund with a 20 year track record of profitably trading financial markets using model-based, systematic approaches. After introducing AHL, I will walk through the process of creating and evaluating a simple trading system. Finally, time permitting, I will talk through some of the current areas of research within our group.
12.2.11 Session 6c: Sponsors’ Talks
Session Room: MS.05
Chair: Flavio B Goncalves
Start time 15:05
SUPPORT FROM THE RSS AND THEIR YOUNG
STATISTICIANS SECTION
Helen Thornewell
Young Statisticians Section
The Royal Statistical Society is the professional body for statistics and statisticians in the UK. The presentation will remind you about the different memberships available as well as courses and qualifications on offer to support YOU. In particular, information will be presented about the RSS Young Statisticians Section, including its aims & objectives, a summary of successes since its official launch at the start of 2009, ways to get involved and adverts for upcoming events. Come and find out more about YOUR section.
Start time 15:40
OPPORTUNITIES IN PROBABILITY AND STATISTICAL
MODELLING AT LLOYDS BANKING GROUP DECISION
SCIENCE
Bill Fite
Lloyds Banking Group
‘There are no problems, only opportunities’ - Jacques Benacin
13 Poster Abstracts by Author
NONPARAMETRIC PREDICTIVE INFERENCE FOR SYSTEM
FAILURE TIME
Abdullah Al-Nefaiee
Durham University, UK
Keywords: Lower and upper probabilities, Nonparametric predictive inference, System reliability
Nonparametric predictive inference (NPI) is a recently developed statistical framework which makes few modelling assumptions and uses lower and upper probabilities to quantify uncertainty. Throughout, we consider the use of NPI to predict reliability of systems, given failure times of tested components which are exchangeable with components used in the system considered. We present some main ideas, and these ideas are illustrated and discussed via examples. We also include a brief outline of main research challenges.
A COMPARISON OF BAYESIAN SPACE-TIME MODELS FOR
OZONE CONCENTRATION LEVELS
Khandoker Shuvo Bakar
School of Mathematics, University of Southampton
Keywords: Space-time modelling, ozone concentrations, auto-regressive model, dynamic linear model, Bayesian spatial prediction
Recently, there has been a surge of interest in space-time modelling of ozone concentration levels. Well known time series modelling methods such as the dynamic linear models (DLM) and the auto-regressive (AR) models are being used together with the Bayesian spatial prediction (BSP) methods adapted for dynamic data. As a result, practitioners in this field often face the daunting task of selecting among these methods. This paper presents a study comparing three approaches: the DLM approach of Huerta et al. (2004), the BSP method as described by Le and Zidek (2006), and the AR models proposed by Sahu et al. (2007). Recent theoretical results (Dou et al., 2009) comparing the first two approaches are extended to include the AR models. The results are illustrated with a realistic numerical simulation example using information regarding the location of the ozone monitoring sites and observed ozone concentration levels in the state of New York in 2005-2006 for the months of June and July. The speed of computation, the availability of high-level software packages for implementing the methods, and the practical difficulties of using the methods for large space-time data sets are also investigated.
BIAS IN MENDELIAN RANDOMIZATION FROM WEAK
INSTRUMENTS
Stephen Burgess and Simon G. Thompson
MRC Biostatistics Unit, University of Cambridge
Keywords: Genetic epidemiology, Mendelian randomization, Causality, Weak instruments, Finite sample bias
A common epidemiological question of interest is whether an observed correlation between a risk factor and a disease is a true causal association. Mendelian randomization is a technique for determining the causal association between a risk factor and an outcome in the presence of several possibly unmeasured confounders. A genetic variant is sought, by means of which, under certain assumptions, a causal association can be estimated. However, even when the necessary underlying assumptions are valid, estimates from analyses using genetic variants which are not strongly associated with the risk factor are biased. This bias, which acts in the direction of the observational association between risk factor and disease, if not correctly acknowledged, may convince a researcher that an observed association is causal, when in fact there is no true association.
USING DYNAMIC STAGED TREES FOR DISCRETE TIME
SERIES DATA: ROBUST PREDICTION, MODEL SELECTION
AND CAUSAL ANALYSIS
Guy Freeman and Jim Q. Smith
University of Warwick, Coventry, UK
Keywords: Staged trees, Bayesian model selection, Bayes factors, forecasting, discrete time series, causal inference, power steady model, multi-process model
The class of chain event graph models is a generalisation of the class of discrete Bayesian Networks, retaining most of the structural advantages of the Bayesian Network for model interrogation, propagation and learning, while more naturally encoding asymmetric state spaces and the order in which events happen. We demonstrate here how with complete sampling, conjugate closed form model selection based on product Dirichlet priors is possible for this class of models. We demonstrate our techniques using a simple educational example, and go on to discuss possible future enhancements to and applications of this model class.
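The closed form that makes conjugate model selection possible can be sketched with the standard Dirichlet-multinomial marginal likelihood. Under a product of independent Dirichlet priors, the log score of a model is a sum of such terms, one per stage. The function names, the (counts, alpha) encoding of stages and the example counts are hypothetical; how the authors group situations into stages is not shown here.

```python
from math import lgamma, exp


def log_marginal_likelihood(counts, alpha):
    """Log marginal likelihood of an observed sequence of categorical outcomes
    with the given counts under a Dirichlet(alpha) prior."""
    return (lgamma(sum(alpha)) - lgamma(sum(alpha) + sum(counts))
            + sum(lgamma(a + n) - lgamma(a) for a, n in zip(alpha, counts)))


def log_score(stages):
    """Product-Dirichlet score: sum over independent stages of (counts, alpha)."""
    return sum(log_marginal_likelihood(c, a) for c, a in stages)


# Log Bayes factor comparing two separate stages with one merged stage,
# using made-up counts and uniform Dirichlet priors.
separate = [([8, 2], [1.0, 1.0]), ([3, 7], [1.0, 1.0])]
merged = [([11, 9], [2.0, 2.0])]
print(log_score(separate) - log_score(merged))
```

Because each score is available in closed form, candidate models can be compared by simple differences of log scores without any numerical integration.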
FINDING CHANGEPOINTS IN A GULF OF MEXICO
HURRICANE HINDCAST DATASET
Rebecca Killick¹, Idris Eckley¹, Kevin Ewans² and Philip Jonathan³
¹ Maths & Stats, Lancaster University
² Shell International Exploration & Production, Netherlands
³ Shell Technology Centre Thornton, Chester
Keywords: Changepoints, Likelihood, Schwarz Information Criterion, Bayesian Information Criterion, GOMOS
Statistical changepoint analysis is used to detect changes in variability within GOMOS hindcast time-series for significant wave heights of storm peak events across the Gulf of Mexico for the period 1900-2005. To detect a change in variance, the two-step procedure consists of (1) validating model assumptions per geographic location, followed by (2) application of a penalised likelihood changepoint algorithm. Results suggest that the most important changes in time-series variance occur in 1916 and 1933 at small clusters of boundary locations at which, in general, the variance reduces. No post-war changepoints are detected. The changepoint procedure is readily applied to other environmental time-series.
ON THE CONVERGENCE OF CONTINUOUSLY MONITORED
BARRIER OPTIONS UNDER MARKOV PROCESSES
Rui Xin Lee and Dr. Vassili Kolokoltsov
University of Warwick, UK
Keywords: Barrier options, Markov chains, Feller process, exit probabilities for continuous time Markov chains, infinitesimal generator
We consider a general barrier option for which the expected discounted random cash flow is modelled as
$$g(S_T)\, I_{\{\tau_A > T\}} + h(S_{\tau_A})\, I_{\{\tau_A \le T\}},$$
where $(S_t)_{t \ge 0}$ is a random price process, $I_C$ denotes the indicator of the set $C$, $\tau_A = \inf\{t \ge 0 : S_t \in A\}$, $g$ denotes a non-negative payoff, $h$ denotes the rebate function and $A$ denotes the knock-out range.
Given barrier option prices under a Feller price process $(S_t)_{t \ge 0}$ with corresponding generator $L$, Mijatovic and Pistorius (2009) present a novel approximation algorithm that constructs a finite-state continuous-time Markov chain $(X^{(n)})$ such that its generator is close to $L$, its law is close to that of $(S_t)_{t \ge 0}$ and its expected payoffs approximate those under $S = \{S_t\}_{t \ge 0}$.
We build on the work of Mijatovic and Pistorius (2009). We study the convergence of such sequences of finite-state continuous-time Markov chains to $S = \{S_t\}_{t \ge 0}$ and establish the rates of convergence.
MULTI-ARMED BANDIT WITH REGRESSOR PROBLEMS
Benedict May and Dr. David Leslie
University of Bristol, UK
Keywords: Bandit Problem, Reinforcement Learning, Linear Regression, Nonparametric Regression
The multi-armed bandit problem is a simple example of the exploitation/exploration trade-off generally inherent in reinforcement learning problems. An agent is tasked with learning from experience how to sequentially make decisions in order to maximize average reward. In the extension considered, the agent is presented with a regressor before making each decision. The agent has to balance the tendency to explore apparently sub-optimal actions (in order to improve regression function estimates) against the tendency to exploit the current estimates (in order to maximise reward). Study of several past approaches to similar problems has indicated particular desirable properties for the policy used. These properties motivate the choice and study of the algorithm that features in this work. The theoretical properties of the algorithm have been studied and it has been tested on both linear and nonparametric regression problems. The intuitive algorithm has useful convergence properties and, compared to many conventional methods, performs well in simulations.
ADAPTIVE ANALYSIS AND DESIGN OF MULTIVARIATE
NORMAL RESPONSE STUDY WITH APPLICATION IN FMRI STUDIES
Giorgos Minas, Dr. F. Rigat, Dr. J. Aston, Prof. N. Stallard and Dr. T. Nichols
Department of Statistics, University of Warwick
Keywords: Multivariate Normal Distribution, Power, prior/posterior distribution, Monte Carlo approximation
We propose a two-stage adaptive design for a study with multivariate normal response where an overall effect is much more important than local effects. A linear combination of the marginals of the second-stage response is the main endpoint. The weights of the linear combination are chosen using the pilot data of the first stage such that power is maximised. Power is defined as the expectation of the rejection probability for the z-test (or t-test) of the linear combination, where the expectation is taken over the posterior distribution of the mean (and variance if unknown) of the multivariate response. The analytic expression for the optimal weighting under an identifiability constraint is given. The power under the optimal weighting is approximated using Monte Carlo approximation and sample size requirements for the two stages are provided. Application in fMRI studies is explored.
BAYESIAN ANALYSIS IN MULTIVARIATE DATA
Rofizah Mohammad and Dr. Karen Young
University of Surrey, UK
Keywords: Model choice, Bayes factors, Influential observations
In this study, we consider Bayesian model selection in multivariate normal data using the well-known Bayes factor. The standard improper priors are used for the model parameters. The device of imaginary observations is used to determine the ratio of unspecified constants in the Bayes factors. We discuss a few different models. The diagnostic kd is used to assess the influence of observations on model choice based on the Bayes factor method. The calculations are illustrated using simulated data and the Iris data set.
MINKOWSKI FUNCTIONAL IN IMAGE ANALYSIS
Noratiqah Mohd Ariff and Dr. Elke Thonnes
University of Warwick, UK
Keywords: Minkowski functional, Boolean model
Various lung diseases, such as emphysema or pulmonary fibrosis, lead to structural deformations in lung tissue. These become apparent as textural changes in high resolution CT scans of the lung. One natural set of descriptors that may be used to quantify textural changes are the so-called Minkowski functionals or intrinsic volumes from integral geometry. These are related to more commonly known measures of shape, curvature and connectivity. In this work, methods of computing the Minkowski functionals from digital images are discussed and their accuracy is tested via standard models in stochastic geometry where the mean Minkowski functionals are already known analytically.
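One simple way of computing the three Minkowski functionals of a two-dimensional binary image is sketched below. It assumes that each foreground pixel is a closed unit square, so that pixels touching at a corner are connected, and obtains the Euler characteristic from counts of vertices, edges and faces. The function name and the example image are hypothetical, and this is only one of several possible computation methods, not necessarily one studied in the poster.

```python
def minkowski_functionals_2d(pixels):
    """Area, perimeter and Euler characteristic of a binary image, with each
    foreground pixel (row, col) treated as a closed unit square."""
    pixels = set(pixels)
    vertices, edges = set(), {}
    for r, c in pixels:
        corners = [(r, c), (r, c + 1), (r + 1, c + 1), (r + 1, c)]
        vertices.update(corners)
        for k in range(4):
            edge = frozenset((corners[k], corners[(k + 1) % 4]))
            edges[edge] = edges.get(edge, 0) + 1
    area = len(pixels)
    # Boundary edges belong to exactly one foreground pixel.
    perimeter = sum(1 for count in edges.values() if count == 1)
    # Euler characteristic of the pixel complex: vertices - edges + faces.
    euler = len(vertices) - len(edges) + len(pixels)
    return area, perimeter, euler


# A 3 x 3 block of pixels with the centre removed: one component, one hole.
ring = [(r, c) for r in range(3) for c in range(3) if (r, c) != (1, 1)]
print(minkowski_functionals_2d(ring))
```

The edge-count perimeter is exact for axis-aligned shapes but overestimates the boundary length of digitised curved or diagonal boundaries, which is one reason the accuracy of such estimators needs to be assessed.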
EXACT DISTRIBUTIONS AND SEQUENTIAL MONTE CARLO
FOR CHANGE POINTS
Christopher Nam, John Aston and Adam Johansen
Department of Statistics, University of Warwick, UK
Keywords: Change Point analysis, Hidden Markov Models, Finite Markov Chain Imbedding, Sequential Monte Carlo Samplers
Quantifying the uncertainty in the locations of change points is a topic of increasingly significant interest, with application areas including economics and genetics. This poster will review an existing methodology for calculating change point distributions using general finite state Hidden Markov Models (HMMs) for a sequence of data. A change point is defined to have occurred when a run of a particular state has occurred consecutively for at least a desired number of time periods. This methodology generates exact distributions for the location of change points for particular parameter values using Finite Markov Chain Imbedding (FMCI). The use of FMCI extends the original posterior Markov chain to a new Markov chain, such that the progress of any particular run can also be recorded within the state space. This ultimately allows the probability distribution function to be characterised completely without requiring any asymptotic arguments or being influenced by sampling error.
However, as these parameter estimates are themselves subject to uncertainty, the methodology is extended to generate samples from the parameter distributions using Sequential Monte Carlo (SMC). This in turn allows a more complete characterisation of the distribution of change points to be computed. The extended methodology benefits from the use of exact conditional distributions within the SMC, and is thus computationally more efficient than other approaches where state estimates for each time point are required.
SAMPLE SIZE RE-ESTIMATION IN CLINICAL TRIALS WITH
MULTIPLE ENDPOINTS
Ives Ntambwe, Tim Friede and Nigel Stallard
Warwick Medical School, University of Warwick, UK
Keywords: Bonferroni, multiple endpoints, sample size re-estimation, familywise error rate
The choice of an appropriate sample size is a main concern in the design of any clinical trial. In the planning stage of a trial one is often quite uncertain about the sizes of parameters or assumptions needed for sample size calculations. The idea of this project is to explore the use of designs that allow checking of these assumptions and adjustment of the sample size if necessary.
Designs with sample size re-estimation, also called designs with internal pilot study (IPS), are conducted to look at assumptions regarding the nuisance parameters.
Multiple endpoints are not uncommon in clinical research. One example is the use of a test battery in schizophrenia. The analysis is complicated in the presence of multiple endpoints and special techniques are needed to control the Type I error rate because hypotheses are tested for various endpoints.
This project aims to bring together the concept of designs with sample size re-estimation and the methodology for dealing with multiple endpoints. This will provide an extension of the current methodology for single endpoint sample size re-estimation to the multiple outcomes setting.
Preliminary results have been based on the use of a Bonferroni correction and show that despite misspecification of the nuisance parameters at the planning stage, the power is maintained when performing sample size re-estimation.
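The interplay between a Bonferroni correction and sample size re-estimation can be illustrated with a minimal sketch. It assumes a two-arm trial with normally distributed endpoints sharing a common standard deviation, the usual normal-approximation sample size formula, and a restricted design in which the re-estimated sample size never falls below the planned one. The function names and all numbers are hypothetical and are not taken from the project.

```python
from math import ceil
from statistics import NormalDist


def per_group_sample_size(sigma, delta, n_endpoints=1, alpha=0.05, power=0.8):
    """Normal-approximation sample size per group for a two-sided two-sample
    comparison, with a Bonferroni split of alpha across the endpoints."""
    z = NormalDist().inv_cdf
    alpha_adj = alpha / n_endpoints
    return ceil(2 * sigma ** 2 * (z(1 - alpha_adj / 2) + z(power)) ** 2 / delta ** 2)


def reestimated_sample_size(pilot_sd, delta, n_planned, **kwargs):
    """Recompute n using the internal pilot estimate of the nuisance parameter;
    never go below the planned n (a restriction assumed here)."""
    return max(n_planned, per_group_sample_size(pilot_sd, delta, **kwargs))


planned = per_group_sample_size(sigma=1.0, delta=0.5, n_endpoints=2)
# Suppose the internal pilot suggests the standard deviation is 1.2, not 1.0.
print(planned, reestimated_sample_size(1.2, 0.5, planned, n_endpoints=2))
```

If the planning value of the standard deviation was too optimistic, the recalculated sample size increases, which is the mechanism by which power can be maintained.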
MODELLING AIR POLLUTION AND ITS RELATIONSHIP TO
HEALTH
Oyebamiji Oluwole, Dr. Alison Gray and Prof. Chris Robertson
Mathematics and Statistics, University of Strathclyde, Glasgow, UK
Keywords: Air pollution, Spatial and temporal modelling, Time series
Atmospheric pollution is any substance capable of altering the natural composition of air and causing harm to both humans and their environment. The adverse effects of airborne pollutants upon human health are well established. The aim of the current work is to model sulphur dioxide (SO2) levels in Scotland and to relate these to health. The method we are adopting is to concentrate on systematic trend surfaces, which provide basic descriptions of the patterns of the data both spatially and temporally, by incorporating these two attributes in a generalized additive regression model.
The study uses SO2 data from 41 stations monitoring air pollutants over Scotland, obtained from the UK Air Quality Archive data website (www.airquality.co.uk/data). The data used cover the years 1996-2007, comprising 3653 days in the entire study period. The data represent daily mean SO2 concentrations. We also have data on the geographical locations of the sites (Easting and Northing). There are missing data, and not all stations have measurements for all the years. Descriptive analysis of the data has been carried out, and the results of investigating ARMA and ARIMA models for time series modelling and imputation of missing data will be shown. Further modelling will involve the use of generalized additive models to incorporate the data attributes of both space and time, before linking the modelled SO2 levels to variables concerning human health.
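To make the imputation step concrete, the following sketch fills gaps in a daily series using the simplest member of the ARMA family, an AR(1) model. This is a deliberately reduced stand-in for the ARMA and ARIMA models mentioned above. The coefficient is estimated from consecutive observed pairs, and each missing value is replaced by its one-step-ahead conditional mean. The function names and the toy series are hypothetical.

```python
def fit_ar1(series):
    """Estimate the mean and AR(1) coefficient from a series in which
    missing values are coded as None, using only consecutive observed pairs."""
    observed = [x for x in series if x is not None]
    mean = sum(observed) / len(observed)
    num = den = 0.0
    for prev, cur in zip(series, series[1:]):
        if prev is not None and cur is not None:
            num += (prev - mean) * (cur - mean)
            den += (prev - mean) ** 2
    phi = num / den if den else 0.0
    return mean, phi


def impute_ar1(series):
    """Replace each missing value by mean + phi * (previous value - mean),
    where the previous value may itself have been imputed."""
    mean, phi = fit_ar1(series)
    filled = []
    for x in series:
        if x is None:
            prev = filled[-1] if filled else mean
            x = mean + phi * (prev - mean)
        filled.append(x)
    return filled


# Toy alternating series with one missing day at the end.
completed = impute_ar1([0.0, 1.0, 0.0, 1.0, None])
```

A fuller treatment would also use the observations that follow a gap (for example through a state-space smoother) and would attach uncertainty to the imputed values.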
INFERENCE ABOUT THE RATIO OF TWO NORMAL MEANS
FOR PAIRED OBSERVATIONS
Francisco J. Rubio
University of Warwick, UK
Keywords: Normal Ratio Distribution, Paired Observations, Reference Analysis
In order to make inferences about the ratio of Normal means β in the case of paired independent Normal random variables (X, Y), appropriate statistical models have been given in the statistical literature. However, in other scientific disciplines such as Cytometry, Physiology, and Medicine, the distribution of the corresponding ratio of the Normal variables, or a Normal approximation to this distribution, is used to estimate β. It has been reported (Merril (1928), Marsaglia (1965), Kuete et al. (2000)) that the distribution of the ratio of two independent Normal variables Z = X/Y can be bimodal, asymmetric, symmetric, or similar to a Normal distribution under some conditions on the parameters. These conditions have been established through simulations and empirical results. The literature reviewed lacks an assessment of the error made with these procedures. The goal of the present work is to quantify and characterize this error in terms of certain conditions on the parameters of the Normal variables. In addition, a result about the existence of a Normal approximation to the distribution of Z when the means of X and Y are positive is presented. Finally, the reference posterior distribution of the ratio of two positive Normal means is analysed.
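One common way to obtain a Normal approximation to Z = X/Y is the first-order delta method, sketched below together with a small simulation for comparison. This is a textbook approximation chosen for illustration and is not necessarily the result presented in the talk. It is only sensible when the mean of Y is several standard deviations away from zero. The function names and parameter values are hypothetical.

```python
import random
import statistics


def delta_method_ratio(mu_x, sd_x, mu_y, sd_y):
    """First-order (delta method) mean and variance of X / Y for
    independent Normal X and Y."""
    ratio = mu_x / mu_y
    variance = ratio ** 2 * ((sd_x / mu_x) ** 2 + (sd_y / mu_y) ** 2)
    return ratio, variance


def simulate_ratio(mu_x, sd_x, mu_y, sd_y, n, seed=0):
    """Draw n independent realisations of X / Y."""
    rng = random.Random(seed)
    return [rng.gauss(mu_x, sd_x) / rng.gauss(mu_y, sd_y) for _ in range(n)]


approx_mean, approx_var = delta_method_ratio(2.0, 0.2, 4.0, 0.4)
draws = simulate_ratio(2.0, 0.2, 4.0, 0.4, n=20000)
simulated_median = statistics.median(draws)
```

Comparing quantiles of `draws` with those of a Normal distribution with mean `approx_mean` and variance `approx_var` gives an empirical impression of the approximation error that the abstract proposes to characterise.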
WAVELET METHODS FOR BRAIN IMAGING ANALYSIS
Yiqin Shen and Dr. J.A.D. Aston
Department of Statistics, University of Warwick
Keywords: Wavelet, Brain Imaging Analysis, fMRI, pre-whitening
The human brain can now be studied using neuroimaging techniques such as functional Magnetic Resonance Imaging (fMRI). fMRI data are four-dimensional: three spatial dimensions and one temporal dimension. When modelling the time dimension, the traditional approach is to estimate linear model parameters using least squares model fitting. An alternative is proposed in the following steps: first, apply a wavelet transform to the data in the space dimension; second, on the coarsest wavelet coefficients, estimate parameters using standard linear regression. These parameters can then be used to construct a prior to estimate the parameters in the rest of the hierarchical wavelet structure by Bayesian linear regression. A normal prior, scaled using the previous parameter estimates, is used, as the noise increases at higher resolutions and the Bayesian framework consequently bounds the estimates and smoothly shrinks the estimated parameters towards zero (helping remove noise); third, apply the inverse wavelet transform to the estimated parameters and thus obtain the parameters for all locations in the original image space.
The errors are not independent in time, so it is necessary to pre-whiten the data and design matrices using the autocorrelation of each time series. However, when the wavelet method is used with autocorrelated time series models, the design matrices at all spatial locations need to be kept identical. Thus, we examine through simulation an approximation using a global value of autocorrelation for the design matrices and apply this to real data.
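The transform, shrink and invert pattern in the steps above can be sketched in one dimension as follows. The example assumes a single level of the orthonormal Haar transform and a zero-mean normal prior with fixed variance on the detail coefficients, which is a simplification of the scaled prior described in the abstract. All function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]   # coarse coefficients
    detail = [(a - b) / s for a, b in pairs]   # fine-scale coefficients
    return approx, detail


def inverse_haar_step(approx, detail):
    """Invert haar_step."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def shrink(coeffs, prior_var, noise_var):
    """Posterior mean of each coefficient under a N(0, prior_var) prior
    and N(coefficient, noise_var) noise: shrinkage towards zero."""
    weight = prior_var / (prior_var + noise_var)
    return [weight * c for c in coeffs]


signal = [4.0, 2.0, 5.0, 5.0]
approx, detail = haar_step(signal)
smoothed = inverse_haar_step(approx, shrink(detail, prior_var=1.0, noise_var=1.0))
```

The smaller the prior variance relative to the noise variance, the more strongly the fine-scale coefficients are pulled towards zero, which is the noise-removing behaviour the abstract describes.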
14 RSC 2011: Cambridge University
34th Research Students’ Conference in
Probability and Statistics
Cambridge
4th - 7th April 2011
Centre for Mathematical Sciences
15 Sponsors’ Advertisements
This year the conference is being financially supported by CRiSM. CRiSM (Centre for Research in Statistical Methodology) is an EPSRC-supported initiative to build capacity in Statistics within the UK. It lives within the Statistics Department at Warwick, and funds three academic positions, five postdoctoral research associates and many PhD students. In addition it organises many workshops and conferences and has an energetic visitor programme. Its director is Gareth Roberts. Further information about its activities can be found at http://www2.warwick.ac.uk/fac/sci/statistics/crism/
Forthcoming CRiSM Workshops:
Tue, Apr 20, '10
CRiSM Workshop: Continuous-time and continuous space processes in ecology
Runs from Tuesday, April 20 to Wednesday, April 21.
Sun, May 30, '10
CRiSM Workshop: Model Uncertainty
Runs from Sunday, May 30 to Tuesday, June 01.
Mon, Jul 12, '10
CRiSM Workshop: Orthogonal Polynomials and Application in Statistics and Stochastic Processes
Runs from Monday, July 12 to Thursday, July 15.
Mon, Mar 28, '11
CRiSM Workshop: InFer (Inference for Epidemic-related Risk)
Runs from Monday, March 28 to Friday, April 01.
ATASS Sports is a leading statistical research
consultancy business providing clients with
high‑quality sports models and predictions.
We are currently looking to fill a range of full‑time
positions to work as part of our existing and
planned research teams. We have both senior and
junior positions available for applied statisticians,
mathematical modellers, database managers
and IT support staff. Generous salary and benefits
packages are the norm and depend upon position,
qualifications and experience. All posts will be
based at our newly developed office complex on
the Exeter business park. Our new recruits will
work within close‑knit multi‑skilled teams to model
a variety of sports, obtaining and incorporating real‑
time information, and developing and applying novel
statistical and mathematical modelling techniques.
The closing date for this round of appointments
is April 22nd 2010. Applications or requests
for further information should be addressed to
Steve Brooks via [email protected]. Additional
information can also be found on our web site –
www.atassltd.co.uk.
ATASS is committed to equality and values diversity.
We welcome applications from all suitably qualified
individuals ‑ see web site for details.
Statistical Modelling Vacancies
www.atassltd.co.uk
Innovation in Sports Modelling
Work for AHL in Oxford
For more information visit www.ahl.com
Candidates must have a PhD or equivalent qualification in a quantitative discipline
Get the benefits of a City career whilst working alongside world class academics
Statistics
Want to extend your reach?
Then play a vital role in developing new life-saving, life-enhancing drugs for people worldwide. Join our Statistics group at Pfizer, and you’ll help one of the world’s largest pharmaceutical companies with the discovery, development and trial of new molecules and compounds that improve lives.
Statistics and Chemometrics
Making decisions with confidence
About us
The Statistics and Chemometrics team includes statisticians, data analysts,
chemometricians and modellers who help clients in the commerce, finance,
process and product development industries to develop better business solutions.
The team draws on Shell Group experience of providing cutting-edge
consultancy, software, innovation and training for more than 30 years to serve
clients worldwide from bases in the UK, the Netherlands and the USA.
Website:
http://www.shell.com/globalsolutions/statisticsandchemometrics
Email:
Shell Global Solutions is a network of independent technology companies in the Shell
Group. In this case study, the expressions ‘Shell’ and ‘Shell Global Solutions’ are sometimes
used for convenience where reference is made to these companies in general, or where no
useful purpose is served by identifying a particular company. The information contained in
this material is intended to be general in nature and must not be relied on as specific advice
in connection with any decisions you may make. Shell and Shell Global Solutions are not
liable for any action you may take as a result of you relying on such material or for any loss
or damage suffered by you as a result of you taking this action. Furthermore, these materials
do not in any way constitute an offer to provide specific services. Some services may not
be available in certain countries or political subdivisions thereof. Photographs are from
various installations. Copyright © 2008 Shell Global Solutions International BV. All rights
reserved. No part of this publication may be reproduced or transmitted in any form or by
any means, electronic or mechanical including by photocopy, recording or information
storage and retrieval system, without permission in writing from Shell Global Solutions
International BV.
Process Solutions
• Statistical process modelling
• Process solutions and software for optimising performance and operating cost, e.g. pre-heat units
• Process software for assuring integrity of pipework
• Tools and techniques to emulate and optimise process conditions
Product Development
• Supporting product development in fuels, lubricants, chemicals: e.g. vehicle testing, emission testing
• Designing experiments to provide evidence to support marketing claims
• Analysing data to help understand effects
• Collaborating with product teams to provide ongoing support
Training and Software
• Customised statistical training courses - statistics and design of experiments training
• Customised training on use of specialist software tools
Chemometrics
• Process Analytical Chemistry using advanced spectroscopy with multivariate calibration models, e.g. for MOGAS blending
• Advanced Process Monitoring using multivariate statistical techniques, e.g. Dynamic Chemical Processes
• Enhanced Experimentation, e.g. catalyst characterisation Electron Microscopy, X-Ray Analysis, Kernels, Multivariate Analysis
Business Solutions
• Statistical forecasting
• Decision tools on Carbon management
• Risk and uncertainty modelling
• Benchmarking advanced data analysis
Sign up to our mailing list to win a Sony e-Reader
Carry 100s of electronic books on a slim, lightweight digital Sony e-book reader and access your essential Wiley collection wherever and whenever you want.
We’re offering a fantastic opportunity to win a Sony eBook reader*. For your chance to win simply sign up to our mailing list at www.wiley.com/go/win
Entrants will be entered into an additional monthly draw for the chance to win a £100 Wiley voucher to spend on Wiley books.
www.wiley.com/go/win
*Full terms and conditions can be found at www.wiley.com/go/win
WIN!
20% conference discount on selected Probability
and Statistics titles
For a limited time only!
Coming this summer:
Bayesian Decision Analysis Principles and Practice
Jim Q. Smith Hardback 9780521764544 | GBP c. 35.00
www.cambridge.org
To view all the titles included in this offer please visit www.cambridge.org/RSC2010
www.rss.org.uk/rss2010
RSS 2010 International Conference
Brighton, 13–17 September
Topics include
• climate change modelling
• trust in official statistics
• 2011 census
• measuring progress
• adaptive clinical trials
• performance indicators
• composite likelihood
• MCMC
• data capture
• risk
• statistical literacy
A forum for presentation and discussion of methodological developments and areas of application for statisticians and users of statistics.
An opportunity for statisticians, analysts, researchers and other users of statistics from all sectors to share current research and insights.
Confirmed speakers
• Tim Davis (Jaguar Land Rover)
• Peter Donnelly (University of Oxford)
• Nancy Reid (University of Toronto)
Key dates
19 April: Extended deadline for RSC attendees
24 May: Deadline for grant and bursary applications
7 June: Deadline for ‘double discount’ registration
9 August: Deadline for discounted registration
8 September: Deadline for pre-event registration
16 RSC History
34 | 2011 | Cambridge
33 | 2010 | Warwick
32 | 2009 | Lancaster
31 | 2008 | Nottingham
30 | 2007 | Durham
29 | 2006 | Glasgow
28 | 2005 | Cambridge
27 | 2004 | Sheffield
26 | 2003 | Surrey
25 | 2002 | Warwick
24 | 2001 | Newcastle
23 | 2000 | Cardiff and University of Wales
22 | 1999 | Bristol
21 | 1998 | Lancaster
20 | 1997 | Glasgow
19 | 1996 | Southampton
18 | 1995 | Oxford
17 | 1994 | Reading
16 | 1993 | Lancaster
15 | 1992 | Nottingham
14 | 1991 | Newcastle
13 | 1990 | Bath
12 | 1989 | Glasgow
11 | 1988 | Surrey
10 | 1987 | Sheffield
(9) | |
(8) | 1985 | Oxford
(7) | 1984 | Imperial College
(6) | |
(5) | |
(4) | |
(3) | 1982 | Cambridge
(2) | 1981 | Bath
(1) | 1980 | Cambridge
Table 1: RSC History
17 Delegate List
Name | Institution | Year | Email | Research Interests
Ahmad Mohammad Aboalkhair | Durham University | 2 | a.m.aboalkhair@durham.ac.uk | Nonparametric Predictive Inference (NPI)-System Reliability
Holly Ainsworth | Newcastle University | 1 | h.f.ainsworth@ncl.ac.uk | Bayesian Modelling for Ecology
Mouna Akacha | University of Warwick | 3 | M.Akacha@warwick.ac.uk | Longitudinal Data, Missing Data, Non-Linear Mixed Models
Faiza Ali | Durham University | 2 | f.f.ali@dur.ac.uk | bayes linear statistics
Abdullah H. Al-nefaiee | Durham University | 2 | a.h.al-nefaiee@durham.ac.uk | Nonparametric predictive inference for system failure time
Muhannad F. K. Al-saadony | University of Plymouth | 1 | muhannad.alsaadony@plymouth.ac.uk | Stochastic Integral and Application to Finance
Osvaldo Anacleto-Junior | The Open University | 1 | o.anacleto-junior@open.ac.uk | Time Series, Bayesian Forecasting, Traffic Modelling
Isadora Antoniano-Villalobos | University of Kent | 2 | ia57@kent.ac.uk | Bayesian inference for Markov continuous real-valued processes
Noratiqah Mohd Ariff | University of Warwick | 1 | strjab@warwick.ac.uk | Statistical Image Analysis
Louis JM Aslett | Trinity College Dublin | 2 | louis@maths.tcd.ie | Bayesian inference and reliability theory
Nuri Badi | Newcastle University | 1 | n.h.badi@newcastle.ac.uk | Generalized linear models
Khandoker Shuvo Bakar | University of Southampton | 2 | ksb2g08@soton.ac.uk | Environmental Modelling, Bayesian Analysis, Spatio-temporal Modelling.
Antonio Armando Ortiz Barranon | University of Kent | 2 | aao33@kent.ac.uk | Extreme Value Theory
Paul Barry | Trinity College Dublin | 1 | barrypb@tcd.ie | Bayesian Inference
Alex Berriman | University of Liverpool | 2 | adcb@liverpool.ac.uk | Epidemiology
Arnab Bhattacharya | Trinity College Dublin | 2 | bhattaca@tcd.ie | Bayesian inference and sequential methods
Sakyajit Bhattacharya | University College Dublin | 2 | sakyajit.bhattacharya@gmail.com | Linear models, deletion diagnostics
Ms Sujunya Boonpradit | University of Sheffield | 1 | stp09sb@sheffield.ac.uk | Bayesian statistics
Philippa Burdett | University of Leeds | 1 | mm08pmb@leeds.ac.uk | Bioinformatics
Stephen Burgess | MRC Cambridge | 2 | stephen.burgess@mrc-bsu.cam.ac.uk | Mendelian randomization, Causal inference, Literary works of L. N. Tolstoy
Simon Byrne | Statistical Laboratory | 2 | s.byrne@statslab.cam.ac.uk | Graphical models, Bayesian statistics
Alberto Caimo | University College Dublin | 2 | alberto.caimo@ucd.ie | Statistical network analysis, MCMC methods, Bayesian statistics
Joe Cainey | University of Bristol | 1 | joe.cainey@bristol.ac.uk | Monte Carlo, MCMC, Adaptive MC, Sequential MC
Jonathan Cairns | University of Cambridge, Dept of Oncology | 1 | jmc200@cam.ac.uk | Computational Biology, Markov Chain Monte Carlo
Sohail Chand | University of Nottingham | 3 | pmxsc1@nottingham.ac.uk | Time series, Regression analysis, Bootstrap methods
Thomas Dessain | Durham University | 1 | t.j.dessain@durham.ac.uk | Applied Probability, discrete and continuous time markov processes
Cara Dooley | NUI, Galway | 1 | caradooley@gmail.com | Survival Analysis, Frailty Models
Susan Doshi | University of Bath | 2 | s.k.doshi@bath.ac.uk | Image analysis, cone-beam CT, image-guided radiotherapy
Fadlalla Elfadaly | The Open University | 2 | f.elfadaly@open.ac.uk | Bayesian Statistics, Subjective Prior Elicitation
Eleni Elia | University of Nottingham | 1 | elenaelia3@hotmail.com | Modelling hospital superbugs
Hisham Abdel Hamid Elsayed | University of Southampton | 3 | hishastat@yahoo.com | Survival Analysis
Marina Evangelou | MRC Cambridge | 1 | marina.evangelou@mrc-bsu.cam.ac.uk | Statistical genetics and Bioinformatics, Genome wide association analysis and Pathway Genome wide analysis
Felicity Kim Evison | University of Durham | 1 | felicity.evison@durham.ac.uk | Public Health Statistics, Medical, Statistical Genetics
Sean Ewings | University of Southampton | 2 | sme1v07@soton.ac.uk | Diabetes
Christopher Fallaize | University of Leeds | 3 | chrisf@maths.leeds.ac.uk | Statistical shape analysis, structural bioinformatics
Aisha Fayomi | University of Nottingham | 3 | Lavenders-lover@hotmail.co.uk | Multivariate Analysis
Verity Fisher | University of Southampton | 1 | vaf1g09@soton.ac.uk | Experimental Design
Ashley Parry Ford | University of Warwick | 2 | a.p.ford@warwick.ac.uk | MCMC, Epidemics
Anna Fowler | Imperial College London | 1 | a.fowler09@imperial.ac.uk | Bayesian statistics; statistical genetics
Guy Freeman | University of Warwick | 4 | g.freeman@warwick.ac.uk | Bayesian statistics, causality, graphical models
Jotham Gaudoin | University of Southampton | 1 | J.P.K.Gaudoin@soton.ac.uk | Bayesian modelling for binary data
Isabella Gollini | University College Dublin | 2 | isabella.gollini@ucd.ie | Model based clustering, Mixture models, Network models
Flavio B Goncalves | University of Warwick | 3 | F.B.Goncalves@warwick.ac.uk | Bayesian inference for diffusions
Aimee Gott | Lancaster University | 1 | a.gott@lancaster.ac.uk | Wavelets
Seungjin Han | University of Sheffield | 1 | s.han@shef.ac.uk | Bayesian, Time Series, Statistical Arbitrage
Siti Rahayu Mohd Hashim | The University of Sheffield | 2 | stp08sm@sheffield.ac.uk | Multivariate quality control
Siew Wan Hee | University of Warwick | 2 | s.w.hee@warwick.ac.uk | Adaptive Bayesian design, Phase II trial
Bryony Hill | University of Warwick | 3 | b.j.hill@warwick.ac.uk | Spatial Statistics, MCMCs, Stochastic Geometry.
Kirsty Hinchliff | Durham University | 1 | k.m.hinchliff@gmail.com | Info-Gap Decision Theory
Nathan Huntley | Durham University | 3 | nathan.huntley@durham.ac.uk | Foundations of Decision Theory, Imprecise Probability
Alberto Alvarez Iglesias | National University of Ireland, Galway | 2 | a.alvareziglesias1@nuigalway.ie | Regression Trees, Classification Trees and Survival Trees
Amin Jamalzadeh | Durham University | 3 | mohammadamin.jamalzadeh@dur.ac.uk | Data Mining Techniques; Bayesian Analysis; Data Visualization
Joao Jesus | University College London | 3 | joao@stats.ucl.ac.uk | Inference Without Likelihood
Emma Jones | University of Sheffield | 2 | stp08emj@shef.ac.uk | Dendrochronology
Chaitanya Joshi | Trinity College Dublin | 3 | joshic@tcd.ie | Bayesian Modelling, Bayesian Inference for Diffusion Processes
Oyebamiji Oluwole Kehinde | University of Strathclyde | 1 | wolemi2@yahoo.com | Spatio-temporal modelling of air pollution
Emma Kershaw | University Of Bristol | 1 | emma.kershaw.08@bristol.ac.uk | Applied Probability, Population Genetics, Statistical Genetics
Mudakkar Mnas Khadim | Queen Mary University of London | 2 | mk@maths.qmul.ac.uk | Design of experiments
Md. Hasinur Rahaman Khan | University of Warwick | 2 | m.h.rahaman-khan@warwick.ac.uk | Bayesian Statistics, Biostatistics, Social Statistics
Mahmuda Khatun | University of Strathclyde | 2 | m.khatun@strath.ac.uk | Statistics in Image Processing
Rebecca Killick | Lancaster University | 2 | r.killick@lancs.ac.uk | Wavelets, Changepoints, Non-stationary Time Series
Jennifer Helen Klapper | University of Leeds | 2 | jennifer@maths.leeds.ac.uk | Wavelets, Vaguelette-Wavelets, HPLC
Maria Konstantinou | University of Southampton | 1 | mk21g09@soton.ac.uk | Experimental Design
Karolina Krzemieniewska | Lancaster University | 1 | k.krzemieniewska@lancs.ac.uk | Wavelets
Tomasz Lapinski | University of Warwick | 1 | T.M.Lapinski@warwick.ac.uk | Probability Theory
Michael Lawton | MRC Cambridge | 1 | michael.lawton@mrc-bsu.cam.ac.uk | Bayesian Hierarchical Modelling and Stochastic Processes
Min Cherng Lee | University of Southampton | 1 | mcl206@soton.ac.uk | Statistical Disclosure Control
Rui Xin Lee | University of Warwick | 1 | ruixin.lee@gmail.com | financial mathematics, probability theory, markov processes, credit derivatives
Ye Liu | University of Lancaster | 1 | y.liu10@lancaster.ac.uk | Extreme Value Theory
Stephanie Llewelyn | University of Sheffield | 1 | s.llewelyn@sheffield.ac.uk | Probability and Statistics
Dominic Magirr | Lancaster University | 1 | d.magirr@lancaster.ac.uk | Early-phase clinical trials, adaptive designs
Colette Mair | University of Glasgow | 2 | c.mair@stats.gla.ac.uk | Population genetics
Patrice Marek | University of West Bohemia | 4 | patrke@kma.zcu.cz | statistics
Kieran Martin | University of Southampton | 2 | kjm2v07@soton.ac.uk | Optimal design for non-linear models
Benedict Christian May | University of Bristol | 2 | bm2668@bris.ac.uk | reinforcement learning, multi-armed bandits, regression
Fiona McElduff | Institute of Child Health, UCL | 3 | f.mcelduff@ich.ucl.ac.uk | discrete distributions
Daniel Michelbrink | The University of Nottingham | 3 | pmxdm@nottingham.ac.uk | Mathematical Finance
Giorgos Minas | University of Warwick | 1 | G.C.Minasatwarwik.ac.uk | Adaptive analysis and design, fMRI
Erin Mitchell | University of Lancaster | 1 | e.mitchell@lancaster.ac.uk | Non-Stationary Time Series Analysis, Dynamic Linear Models
Joanne Louise Moffatt | The University of Salford | 2 | j.l.moffatt@pgr.salford.ac.uk | Modelling techniques used to investigate strategies teams/players can apply to increase their chances of winning in a sporting contest.
Nur Anisah Mohamed | Newcastle University | 1 | n.a.mohamed@newcastle.ac.uk | Optimal Dynamic Treatment Regimes
Rofizah Mohammad | University of Surrey | 2 | r.mohammad@surrey.ac.uk | Bayesian Modelling in multivariate data
Christopher Nam | University of Warwick | 1 | c.f.h.nam@warwick.ac.uk | Hidden Markov Models
Stuart Nicholls | Lancaster University | 3 | s.nicholls@lancaster.ac.uk | Latent variable models, bioethics, decision-making, attitude measurement
Mitra Noosha | Queen Mary University of London | 2 | mnoosha@hotmail.com | Discordancy between prior and data in Bayesian Inference
Beth Norris | University of Kent | 2 | bn40@kent.ac.uk | statistical ecology
Ives Ntambwe | Warwick Medical School | 2 | i.l.ntambwe@warwick.ac.uk | Medical Statistics
Emmanuel Olusegun Ogundimu | University of Warwick | 1 | E.O.Ogundimu@warwick.ac.uk | missing data, longitudinal studies and surrogate markers evaluation in clinical trials
Adrian O’Hagan | UCD Dublin | 3 | adrian.ohagan@hotmail.co.uk | Mixture Models, Generative Discriminative Hybrids, Extensions to the EM Algorithm
Aidan O’Keeffe | MRC Cambridge | 2 | aidan.o’keeffe@mrc-bsu.cam.ac.uk | Dynamic causal inference and multi-state modelling
Rachel Oxlade | Durham University | 2 | r.h.oxlade@durham.ac.uk | Bayesian statistics, Bayes linear, uncertainty analysis, computer simulators, emulation
Ioannis Papastathopoulos | Lancaster University | 1 | i.papastathopoulos@lancaster.ac.uk | Extreme Value Theory, Bayesian Statistics, Time Series
Christopher Pearce | University of Liverpool | 3 | Pearchrs9@aol.com | Epidemiology, Stochastic models
Duy Pham | University of Warwick | 2 | Duy.Pham@warwick.ac.uk | Probability, Financial Mathematics, Interest Rate Modelling
Murray Pollock | University of Warwick | 1 | murray.pollock@warwick.ac.uk | MCMC
Benedict Powell | Durham University | 1 | benedict.powell@durham.ac.uk | Bayesian emulation, multivariate spatial statistics
Helen Powell | University of Glasgow | 2 | h.powell@stats.gla.ac.uk | Modelling the effects of air pollution
Dennis Prangle | Lancaster University | 3 | d.prangle@lancaster.ac.uk | Bayesian Statistics, ABC, Infectious Disease Models
Iain Proctor | University of Glasgow | 2 | ipro@ceh.ac.uk | Statistics, Ecology
Noorazrin Abdul Rajak | Newcastle University | 2 | noorazrin.abdul-rajak@ncl.ac.uk | Bayesian Experimental Design
Clare Emily Raychaudhuri | Bristol University | 2 | clare.e.martin@gmail.com | Stochastic differential equations, numerical integration, chaotic systems
Shijie Ren | University of Sheffield | 3 | stp07sr@shef.ac.uk | Bayesian Clinical Trials
Jennifer Rogers | University of Warwick | 3 | J.K.Rogers@warwick.ac.uk | Survival Analysis, Recurrent Events
Verena Roloff | MRC Biostatistics Unit | 2 | verena.roloff@mrc-bsu.cam.ac.uk | Meta-analysis
Francisco Rubio | University of Warwick | 1 | F.J.Rubio@warwick.ac.uk | Bayesian Statistics, Biostatistics
Alastair Rushworth | University of Glasgow | 1 | alastair@stats.gla.ac.uk | Spatial and environmental modelling
Fiona Sammut | University of Warwick | 1 | f.sammut@warwick.ac.uk | Multivariate Analysis, GLMs
Ria Sanderson | Office for National Statistics | NA | ria.sanderson@ons.gov.uk | Sample design & estimation, modelling techniques
Susanne Schmitz | Trinity College Dublin | 1 | schmitzs@tcd.ie | Bayesian Inference
Javier Serradilla | Newcastle University | 3 | javier.serradilla@ncl.ac.uk | Gaussian Processes, Multivariate Statistical Process Control, Factor Analysis
Golnaz Shahtahmassebi | University of Plymouth | 2 | golnaz.shahtahmassebi@plymouth.ac.uk | Financial Statistics, Bayesian modelling, Computational Statistics
Yiqin Shen | University of Warwick | 1 | yiqin.shen@warwick.ac.uk | Brain imaging analysis
Andrew Simpkin | National University of Ireland, Galway | 3 | a.simpkin1@nuigalway.ie | Smoothing and Derivatives
Sawaporn Siripanthana | University of Sheffield | 1 | smp08ss@sheffield.ac.uk | Statistical process control
Andrew Smith | University of Bristol | 3 | Andrew.D.Smith@bristol.ac.uk | Nonparametric regression, Image analysis
Joanna Smith | University of Glasgow | 2 | j.smith@stats.gla.ac.uk | Shape analysis
Michelle Stanton | Lancaster University | 3 | m.stanton@lancaster.ac.uk | Spatial and spatio-temporal epidemiology; tropical disease epidemiology
Natalie Staplin | University of Southampton | 2 | nds105@soton.ac.uk | Survival Analysis
Kara Nicola Stevens | University of Bristol | 2 | kara.stevens@bristol.ac.uk | Time Series Analysis
Alexander Strawbridge | MRC Cambridge | 2 | alexander.strawbridge@mrc-bsu.cam.ac.uk | Measurement Error
David Suda | Lancaster University | 3 | d.suda@lancs.ac.uk | Stochastic Calculus, Bayesian Inference, Computational Statistics
James Sweeney | Trinity College Dublin | 3 | sweeneja@tcd.ie | Spatial statistics, multidimensional integration, Nonparametric regression
Sarah Taylor | Lancaster University | 1 | s.taylor14@lancaster.ac.uk | Modelling of textures using wavelets
Alexandre Thiery | University of Warwick | 1 | a.h.thiery@warwick.ac.uk | Monte Carlo methods - Statistical physics
Howard Thom | MRC Cambridge | 1 | howard.thom@mrc-bsu.cam.ac.uk | Model Averaging, Cost Effectiveness Analysis
Maria R Thomas | Queen Mary University London | 3 | mdr@maths.qmul.ac.uk | Bayesian statistics and dose finding in clinical trials
Helen Thornewell | University of Surrey | 3 | h.thornewell@surrey.ac.uk | Experimental Design
Tomas Toupal | University of West Bohemia | 2 | ttoupal@kma.zcu.cz | statistics
Michael Tsagris | University of Nottingham | 1 | pmxmt1@nottingham.ac.uk | Robust Statistics
Eleni Verykouki | University of Nottingham | 2 | pmxev@nottingham.ac.uk | Bayesian Statistics, MCMC
Rountina Vrousai | Trinity College Dublin | 2 | vrousair@tcd.ie | Bayesian methods for spatial-temporal analysis
Jenny Wadsworth | Lancaster University | 2 | j.wadsworth@lancaster.ac.uk | Extreme value theory; Bayesian methods, especially nonparametrics
Eva Wagnerova | University of West Bohemia | 1 | ewa@kma.zcu.cz | statistics
Neil Walker | University of Bristol | 3 | neil.walker@fera.gsi.gov.uk | Environmental Statistics
Chun Wang | University of Nottingham | 3 | pmxcw1@nottingham.ac.uk | Mathematical Finance
Kevin Wilson | Newcastle University | 3 | k.j.wilson@ncl.ac.uk | Bayesian inference, Bayes linear methods, experimental design
Colin Worby | University of Nottingham/HPA | 1 | colin.worby@hpa.org.uk | Infection Modelling, Markov Models, MCMC
Alan Wright | University of Plymouth | 1 | alan.wright3@plymouth.ac.uk | Genetic Epidemiology
Yang Xia | MRC Cambridge | 1 | yang.xia@mrc-bsu.cam.ac.uk | Multi-state modelling and survival analysis
Tatiana Xifara | Lancaster University | 1 | t.xifara@lancaster.ac.uk | MCMC Methods, Bayesian Statistics
Lei Yan | University of Nottingham | 2 | pmxly1@nottingham.ac.uk | Image Analysis, Stochastic Processes
Peng Yin | Newcastle University | 1 | peng.yin@ncl.ac.uk | statistics analysis of missing data
Yeung Wai Yin (Winnie) | Queen Mary, University of London | 2 | w.y.yeung@qmul.ac.uk | Biased Coin Design in clinical Trials
Ben Youngman | University of Sheffield | 3 | b.youngman@sheffield.ac.uk | Extreme value theory
Nur Fatihah Mat Yusoff | National University of Ireland, Galway | 2 | n.matyusoff1@nuigalway.ie | Structural Equation Modeling, Correspondence Analysis, Measurement Error, Latent Variable
Vytaute Zabarskaite | University of Nottingham | 1 | pmxvz@nottingham.ac.uk | Stochastic processes in Mathematical Finance
Piotr Zwiernik | University of Warwick | 3 | p.w.zwiernik@warwick.ac.uk | algebraic and geometric methods in statistics, model identifiability, asymptotics under non-regular scenarios
Best Talks and Poster
Prizes will be awarded to the three best talks and the best poster as voted for by yourselves, the delegates.
Please use this page to vote for your favorite two talks and your favorite poster and hand it in during lunchtime on Wednesday.
1st best talk:
2nd best talk:
Best poster:
Back of RSC 2010 voting slip
Department of Statistics
University of Warwick
Coventry
CV4 7AL
www2.warwick.ac.uk/fac/sci/statistics/postgrad/rsc/2010/