33rd Research Students’ Conference in Probability and Statistics, 12th-15th April 2010: Conference Proceedings



Timetable of Events

Monday 12th April

13:00 Registration of Delegates (The Street)

15:00 Afternoon Tea (The Street)

15:30 Opening Address & Plenary Session (MS.01, Maths/Stats Building)

    15:30 Opening Address: Prof. Jane L. Hutton (University of Warwick)
    15:45 Plenary Talk I: Prof. Jim Q. Smith (University of Warwick)
    16:20 Plenary Talk II: Dr. Jonathan Rougier (University of Bristol)
    16:55 Announcements/Housekeeping

18:00 Dinner (Rootes Social Building)

19:00 Pub Quiz (Varsity Pub)

Tuesday 13th April

07:30 Breakfast (Rootes Social Building)

09:10 Session 1 (Math/Stats Building)

11:10 Refreshments (The Street)

11:30 Session 2 (Math/Stats Building)

13:30 Lunch (The Street)

14:30 Session 3 (Math/Stats Building)

16:10 Poster Session and Refreshments (The Street)

18:00 Dinner (Rootes Social Building)

19:00 Evening Entertainment (Coventry City Centre)

    19:00 Bus Collection by Students’ Union to Cross Point Business Park (Bowling and Cinema)
    19:30 Bus Collection by Students’ Union to Town Hall (Pub Crawl)

22:00 Bus Collection from Bowling and Cinema to Campus

23:30 First Bus Collection from Pub to Campus

00:30 Second Bus Collection from Pub to Campus

Wednesday 14th April

07:30 Breakfast (Rootes Social Building)

09:10 Session 4 (Math/Stats Building)

11:10 Refreshments (The Street)

11:30 Session 5 (Math/Stats Building)

13:30 Lunch (The Street)

14:30 Sponsors’ Talks (Math/Stats Building)

16:10 Sponsors’ Wine Reception (The Street)

18:15 Bus Collection to Conference Dinner (Coventry Transport Museum)

22:15 First Bus Collection to Campus

23:45 Second Bus Collection to Campus

Thursday 15th April

07:30 Breakfast (Rootes Social Building)

09:30 Delegates Depart


Contents

1 Welcome from the Organisers

2 The City and University

3 Campus Map

4 University Facilities

5 Accommodation

6 Conference Details
    6.1 Meals
    6.2 Sponsors’ Wine Reception

7 Help, Information and Telephone Numbers
    7.1 Departmental Computing and Internet Access

8 Instructions
    8.1 For Chairs
    8.2 For Speakers
    8.3 For Displaying a Poster
    8.4 Prizes

9 Plenary Session
    9.1 Professor Jane L. Hutton (University of Warwick)
    9.2 Professor Jim Q. Smith (University of Warwick)
    9.3 Dr. Jonathan Rougier (University of Bristol)

10 List of Sponsors’ Talks

11 Talks Schedule
    11.1 Monday 12th April
    11.2 Tuesday 13th April
    11.3 Wednesday 14th April

12 Talk Abstracts by Session
    12.1 Tuesday 13th April
        12.1.1 Session 1a: Image Analysis
        12.1.2 Session 1b: Computational Statistics
        12.1.3 Session 1c: Operational Research
        12.1.4 Session 1d: Statistical Inference
        12.1.5 Session 2a: Medical Statistics I
        12.1.6 Session 2b: Financial
        12.1.7 Session 2c: Elicitation and Epidemiology
        12.1.8 Session 2d: Multivariate Statistics
        12.1.9 Session 3a: Genetics
        12.1.10 Session 3b: Medical Statistics II
        12.1.11 Session 3c: Dimension Reduction
        12.1.12 Session 3d: Environmental
    12.2 Wednesday 14th April
        12.2.1 Session 4a: Medical Statistics III
        12.2.2 Session 4b: Point Processes and Spatio-temporal Statistics
        12.2.3 Session 4c: General
        12.2.4 Session 4d: Graphical Models and Extreme Value Theory
        12.2.5 Session 5a: Experimental Design and Population Genetics
        12.2.6 Session 5b: Censoring in Survival Data and Non-Parametric Statistics
        12.2.7 Session 5c: Time Series and Diffusions
        12.2.8 Session 5d: Probability
        12.2.9 Session 6a: Sponsors’ Talks
        12.2.10 Session 6b: Sponsors’ Talks
        12.2.11 Session 6c: Sponsors’ Talks

13 Poster Abstracts by Author

14 RSC 2011: Cambridge University

15 Sponsors’ Advertisements

16 RSC History

17 Delegate List

1 Welcome from the Organisers

Welcome to the 33rd Research Students’ Conference in Probability and Statistics (RSC2010). This year the conference is hosted by the University of Warwick. The RSC is an annual event aiming to provide postgraduate statisticians and probabilists with an appropriate forum to present their research. This four-day event is organised by postgraduates, for postgraduates, providing an excellent opportunity to make contacts and discuss work with other students who have similar interests.

For many students this will be your first experience of presenting your work, with some of you also taking the opportunity to chair a session. For those of you attending and not presenting, we hope that you will benefit greatly from observing others and networking with researchers working in a similar field.

Finally, we will be looking for potential hosts for RSC 2012. If you think your institution would be keen to take part in such an exciting project, please let us know. Next year the conference will be held in Cambridge.

Mouna Akacha, Flavio Goncalves, Bryony Hill and Jennifer Rogers
Conference Organisers


2 The City and University

The University of Warwick is one of the leading UK research universities and is ranked number 1 in the Midlands. Consistently ranked in the top ten of UK universities, it is an entrepreneurial institution that has a large positive impact on local and regional communities. The University is located in the heart of England, 3 miles (5 kilometres) from Coventry city centre, on the border with Warwickshire.

Coventry

Coventry is a city and metropolitan borough in the county of West Midlands in England. Coventry is the 9th largest city in England and the 11th largest in the United Kingdom. It is also the second largest city in the English Midlands, after Birmingham.

Coventry is situated 95 miles (153 km) northwest of London and 19 miles (30 km) east of Birmingham, and is farther from the coast than any other city in Britain. Although it has a population of almost a third of a million, Coventry is not a member of the English Core Cities Group, owing to its proximity to Birmingham.

Coventry was also the world’s first ‘twin’ city when it formed a twinning relationship with the Russian city of Stalingrad (now Volgograd) during World War II. The relationship developed through ordinary people in Coventry who wanted to show their support for the Soviet Red Army during the Battle of Stalingrad. The city is now twinned with Dresden and with 27 other cities around the world.

Coventry Cathedral is one of the newer cathedrals in the world, having been built following the World War II bombing of the ancient cathedral by the Luftwaffe. Coventry motor companies have contributed significantly to the British motor industry, and the city has two universities: the city centre-based Coventry University and the University of Warwick on the southern outskirts.

In the late 19th century, Coventry became a major centre of bicycle manufacture, with the industry being pioneered by Rover. By the early 20th century, bicycle manufacture had evolved into motor manufacture, and Coventry became a major centre of the British motor industry. Over 100 different companies have produced motor vehicles in Coventry, but car production came to an end in 2006 as the last car rolled off the lines at Peugeot’s Ryton plant. Production was transferred to a new plant near Trnava, Slovakia, with the help of EU grant aid to Peugeot; this made Peugeot deeply unpopular in the city. The design headquarters of Jaguar Cars is still in the city at the Whitley plant and, although Jaguar ceased vehicle assembly at its Browns Lane plant in 2004, it continues some operations from there.

A major visitor attraction in Coventry city centre is the free-to-enter Coventry Transport Museum, which has the largest collection of British-made road vehicles in the world and will be the venue for our Conference Dinner. The most notable exhibits are the world speed record-breaking cars, Thrust2 and ThrustSSC. The museum received a major refurbishment in 2004, which included the creation of a striking new entrance as part of the city’s Phoenix Initiative project. The revamp saw the museum exceed its projected five-year visitor numbers within the first year alone, and it was a finalist for the 2005 Gulbenkian Prize.

The most famous daughter of Coventry is Lady Godiva. Her ride through the streets of the city has passed into legend. According to the popular story, Lady Godiva took pity on the people of Coventry, who were suffering grievously under her husband’s oppressive taxation. Lady Godiva appealed again and again to her husband, who obstinately refused to remit the tolls. At last, weary of her entreaties, he said he would grant her request if she would strip naked and ride through the streets of the town. Lady Godiva took him at his word and, after issuing a proclamation that all persons should stay indoors and shut their windows, she rode through the town, clothed only in her long hair. Today a statue in the heart of the city centre commemorates her bravery.

The University

The establishment of the University of Warwick was approved by the government in 1961, and the University received its Royal Charter of Incorporation in 1965. It straddles the boundary between the City of Coventry and the County of Warwickshire. The idea for a university in Coventry was mooted shortly after the conclusion of the Second World War, but it was a bold and imaginative partnership of the City and the County which brought the University into being on a 400 acre site jointly granted by the two authorities. The University incorporated the former Coventry College of Education in 1978 and has since extended its land holdings by the purchase of adjoining farm land.

The University initially admitted a small intake of graduate students in 1964 and took its first 450 undergraduates in October 1965. In October 2009, the student population was over 21,598, of whom around 9,008 were postgraduates. 25% of the student body comes from overseas and over 125 countries are represented on the campus. The University has 29 academic departments and over 50 research centres and institutes, in four Faculties: Arts, Medicine, Science and Social Sciences. The University hosts two HEFCE Centres for Excellence in Learning and Teaching (CETLs): CAPITAL and Reinvention. The new Medical School took its first students on an innovative 4-year accelerated postgraduate programme in September 2000. In summer 2004 the first 64 students graduated from the school. In October 2004 the combined intake of the Warwick Medical School was 403, making it one of the largest in the country.

From its beginnings, the University has sought to be excellent in both teaching and research. It has now secured its place as one of the UK’s leading research universities, confirmed by the results of the government’s Research Assessment Exercises since 1986. In all of these, Warwick has been placed in the top half dozen or so of universities for the quality of its research. The results of the 2008 Research Assessment Exercise (RAE) reaffirmed Warwick’s position as one of the UK’s leading research universities, with Warwick ranked 7th overall in the UK (based on multi-faculty institutions).

The University of Warwick campus was recently voted the best campus in the UK. It’s a lively, cosmopolitan place with its own shops, banks, bars and restaurants - an exciting place to live and work with everything you could need close at hand. There is a great sense of community at Warwick: the campus is home to students and staff from over 120 different countries and from all backgrounds, and is a great resource for the local community with excellent facilities such as Warwick Arts Centre and the University Sports Centre. The campus is continually developing; in August 2008 Warwick Digital Laboratory was opened by Prime Minister Gordon Brown, and the indoor tennis centre at Westwood campus was opened in March 2008. The campus is situated on three adjacent sites: Central campus, Gibbet Hill campus and Westwood campus. There are lakes and woods, trees and landscaped gardens but, whilst the campus has many green open spaces, inside the buildings ground-breaking research is taking place and academics and students are sharing their knowledge and experience.

The Department

The Department of Statistics at the University of Warwick is one of the largest UK concentrations of researchers in statistics and probability, and the synergy between probabilistic and statistical research is particularly strong. The research environment is vibrant, with a large and active community of PhD students and postdoctoral researchers, excellent library, computing and other research support facilities, and sustained programmes of research seminars, workshops and international visitors. There are strong research links with other disciplines both at Warwick and externally.

Research related activities (seminars, workshops, visitors, etc.) take place mainly through three long-term initiatives: CRiSM (Centre for Research in Statistical Methodology), P@W (Probability at Warwick) and RISCU (Risk Initiative and Statistical Consultancy Unit). CRiSM is funded by EPSRC and HEFCE, as well as Warwick, as a national Science and Innovation investment. P@W is a focus for inter-departmental probability research at Warwick and for the organisation of externally open research workshops and training events in probability, while RISCU provides resources for developing applied research collaborations with industry, commerce, government and other outside bodies, and with other academic disciplines.

The Department’s research ranges from probability theory, through computation and statistical methodology, to substantive applications in many different fields. In the most recent national Research Assessment Exercise (RAE 2008), the Department had 70% of its activity rated as internationally excellent (grade 3* or higher), with more than a quarter classed as world leading (grade 4*). For publications by members of the department, please see individual staff web pages.

The Department leads the EPSRC-funded Academy for PhD Training in Statistics, a collaboration with eight other prominent UK research groups to organise intensive courses for first-year PhD students. From 2010 a further new feature of our PhD provision is the EPSRC-funded MASDOC initiative for doctoral training at the interface between statistics and applied mathematics.


3 Campus Map

[Figure: campus map of the University of Warwick. Buildings are numbered 1-71 and located by grid reference (letters A-H, numbers 1-9). Map symbols: University Buildings; Student Residences; Car Parks; Building Entrances; Wheelchair Accessible Entrances; Controlled Access; Footpaths; Footpaths/Cycleways; One way Road; Bus Stop; No Entry.]

For the most up-to-date version of this map go to warwick.ac.uk/go/maps
For further information see the University web site or mobile site www.m.warwick.ac.uk

BUILDING KEY

Building ... map number ... grid reference

International Automotive Research Centre (IARC) ... 1 ... E4
Arden ... 2 ... F2
Argent Court, incorporating Estates, AdsFab & Jobs.ac.uk ... 3 ... G3
Arthur Vick ... 4 ... F6
Avon Building, incorporating Drama Studio ... 5 ... G2
Benefactors ... 6 ... C5
Biological Sciences ... 7 ... D8
Biomedical Research ... 8 ... D8
Gibbet Hill Farmhouse ... 9 ... C8
Chaplaincy ... 10 ... D5
Chemistry ... 11 ... D4
Claycroft ... 12 ... G5
Computer Science ... 13 ... E4
Coventry House ... 14 ... D5
Cryfield, Redfern & Hurst ... 15 ... B5
Dining & Social Building Westwood ... 16 ... G2
Education, Institute of, incorporating Multimedia CeNTRE & TDA Skills Test Centre ... 17 ... H2
Engineering ... 18 ... E4
Engineering Management Building ... 19 ... F2
Games Hall ... 20 ... E2
Gatehouse ... 21 ... D3
Health Centre ... 22 ... D6
Heronbank ... 23 ... A4
Humanities Building ... 24 ... E4
International House ... 25 ... C6
International Manufacturing Centre ... 26 ... E4
IT Services Elab level 4 ... 27 ... H2
IT Services levels 1-3 ... 28 ... H2
Jack Martin ... 29 ... E6
Lakeside ... 30 ... B3
Lakeside Apartments ... 31 ... B2
Library ... 32 ... D4
Lifelong Learning ... 33 ... G2
Medical School Building ... 34 ... D8
Mathematics & Statistics (Zeeman Building) ... 35 ... F4
Maths Houses ... 36 ... E8
Medical Teaching Centre ... 37 ... D8
Millburn House ... 38 ... F3
Modern Records Centre & BP Archive ... 39 ... D5
Music ... 40 ... H2
Nursery ... 41 ... C3
Physical Sciences ... 42 ... D4
Physics ... 43 ... D4
Porters & Postroom ... 44 ... G1
Psychology ... 45 ... E5
Radcliffe ... 46 ... C4
Ramphal Building ... 47 ... D4
Rootes ... 48 ... C6/D6
Rootes Building ... 49 ... C5
Scarman ... 50 ... C3
Science Education ... 51 ... H2
Shops ... 52 ... D5
Social Sciences ... 53 ... D4
Sports Centre ... 54 ... E5
Sports Pavilion ... 55 ... A5
Students’ Union ... 56 ... D5
Tennis Centre ... 57 ... F2
Tocil ... 58 ... F5
University House, incorporating Learning Grid ... 59 ... E2
Vanguard Centre ... 60 ... G3
Warwick Arts Centre, incorporating Music Centre ... 61 ... D5
Warwick Business School (WBS) ... 62 ... D4
    WBS Main Reception, Scarman Rd ... 62 ... D3
    WBS Social Sciences ... 63 ... D5
    WBS Teaching Centre ... 64 ... C4
Warwick Digital Laboratory ... 65 ... F4
WarwickPrint ... 66 ... H2
Westwood ... 67 ... G1/G2
Westwood Gatehouse OCNCE ... 68 ... H2
Westwood House, incorporating Occupational Health, Counselling & DARO Calling Room ... 69 ... G2
Westwood Teaching and Westwood Lecture Theatre ... 70 ... H2
Whitefields ... 71 ... D5

A full-size version of the map is provided in the Conference pack.


4 University Facilities

Everything you will need during your stay can be found on the University campus. Situated on 700 acres of rural parkland, the campus ‘village’ environment has its own banks, bars, shops and outlets.

All meals (breakfasts, lunches, dinners and morning/afternoon refreshments) are included in the conference registration. However, if you find yourself still hungry there are a number of bars and cafes open around campus and also a small Costcutter supermarket located next to the Student Union. Inside Costcutter there is also a Post Office and Copyshop (for printing, photocopying and binding).

A 10-minute walk takes you to the local Tesco, Boots and Iceland at Cannon Park Shopping Centre. Coventry’s high street stores are a bus ride away, as is Leamington Spa’s range of boutique and high street shops.

The Student Union building (possibly the largest in Europe) has recently been rebuilt in an £11 million redevelopment project. As well as a new entertainments venue, there are also more spaces for those who just want to go out and have a drink, including the new pub ‘The Dirty Duck’, which serves its own local ale, and ‘The Terrace Bar’, which looks out over the Piazza. Downstairs in the Union are branches of two major UK banks, Barclays and NatWest, and also a pharmacy and hair salon, should you need them!

If you are coming by rail or bus (e.g., National Express or Megabus), you should come to Coventry. Travel Coventry service number 12 buses (which display the destination University of Warwick or Leamington Spa) run from the city centre bus station (Pool Meadow), via Coventry Rail Station, to the University Central Campus, passing the Westwood campus en route.

Free car parking is available for all delegates staying on campus. You can request an access code for car parks 7, 8 and 15 (see campus map) from Rootes Social Building reception when you check in.

5 Accommodation

Accommodation is in en-suite rooms on campus, a 5-minute walk from both the Math/Stats Department and Rootes Social Building, where breakfast and dinners will be served.

All rooms have towels and toiletries. Kitchen facilities are available, although all meals are provided.

Internet is available in all bedrooms. Details of how to log onto the system will be displayed in each individual bedroom, but delegates will need to bring their own Ethernet cable. Cables can be purchased from Rootes Reception by anyone who does not have one.

Rooms will be available for check-in after 15:00; luggage can be left at Rootes Reception in Rootes Social Building until this time. All bedrooms must be vacated by 9:30am on Thursday 15th.


6 Conference Details

On Monday 12th, delegates should arrive at the Math/Stats Building (Zeeman Building) between 13:00 and 15:00 to register and collect conference packs. These contain all the information needed during the conference. If you are presenting a poster, please submit it at registration. The conference will open with the plenary session at 15:30 in the Math/Stats Department.

On Tuesday 13th and Wednesday 14th, delegates will have the opportunity to present talks. Posters will be on display in The Street of the Math/Stats Building throughout the afternoon of Tuesday 13th, with the poster session commencing at 16:10. Presenters are encouraged to be near their posters during this session in order to answer questions from interested participants.

6.1 Meals

Breakfasts and evening meals (except on the evening of the conference dinner) will be served in Rootes Restaurant on campus. Lunches and morning/afternoon refreshments will be served in the Math/Stats Department, where the conference will be held.

Please note that on the first day of the conference (Mon 12th) we will not be providing any lunch. However, there are plenty of eating facilities available on campus, and tea, coffee and cakes will be served before the plenary session.

Dinner on the Wednesday evening will be at Coventry Transport Museum. You will be expected to wear formal attire (no jeans or trainers please). Before the meal you will be given an opportunity to have a look around the museum, and afterwards there will be a ceilidh, followed by a DJ.

Coaches to the conference dinner will pick delegates up by the Students’ Union at 18:15.

6.2 Sponsors’ Wine Reception

The Sponsors’ Reception will be held in The Street in the Maths/Stats Building on the Wednesday at 16:10, prior to the conference dinner. Please take this opportunity to talk with our sponsors and visit their displays to learn more about possible career opportunities.


7 Help, Information and Telephone Numbers

Department address:
Dept of Statistics
University of Warwick
Coventry
CV4 7AL
Telephone: 024 7657 4812
Fax: 024 7652 4532

Emergency Numbers:
University Security: 024 7652 2083 (also for general emergencies)
Conference Organiser: 077 2998 4952 (Jennifer Rogers, resident on campus)

Transport:
Swift Taxis Coventry: 024 7676 7676
Trinity Street Taxis: 024 7663 1631
Bus information: 0871 200 2233
National Rail Enquiries: 08457 484950

7.1 Departmental Computing and Internet Access

Free wireless internet access will be available to all delegates in The Street area of the Maths/Stats Building. After the Plenary Session you will be given the username and password needed to access this service from your laptop.


8 Instructions

8.1 For Chairs

• Please arrive at the appropriate seminar room five minutes before the start of your session. Familiarise yourself with the visual equipment.

• Packs will be left in each seminar room. Do not remove the packs or any of their contents from the seminar room. If you think something might be missing from the pack, please contact one of the organisers.

• You should clearly introduce yourself and each speaker in turn.

• It is very important that we stick to the schedule. Therefore please start the session on time, use the time remaining cards, and make sure that questions are not allowed to delay the rest of the session.

• If a speaker fails to show, please advise the audience to attend a talk in an alternative seminar room. Do not move the next talk forward.

• After each talk, thank the speaker, encourage applause, and open the floor to questions (from students only). If no questions are forthcoming, ask one yourself.

• Use the 5 min and 1 min flash cards to assist the speaker in finishing on time.

8.2 For Speakers

• Each seminar room will contain a computer, data projector and white/blackboard.

• Arrive five minutes before the start of the session, introduce yourself to the chair and load your presentation onto the computer.

• Presentations must be PDF or PowerPoint (ppt or pptx) files. No other format is acceptable.

• Talks are strictly fifteen minutes plus five minutes for questions. Anyone going over this time will be asked to stop by the chair.

• Your chair will let you know when you have five minutes and then one minute remaining for your presentation.


8.3 For Displaying a Poster

• The poster session will be held in The Street area of the Math/Stats Building at 16:10 on Tuesday 13th April.

• Please submit posters upon registration on Monday 12th April.

• Posters will be erected by conference organisers.

• During the poster session, it is advisable to be near your poster in order to answer questions from interested participants.

• Posters will also be displayed throughout Tuesday afternoon.

• Please ensure that your poster is removed by 17:30 on Tuesday.

• Posters should be no larger than A1.

8.4 Prizes

The three best talks and the best poster, as voted for by all delegates, will receive prizes in the form of book vouchers from our sponsors CUP and Wiley-Blackwell. Additionally, courtesy of the Royal Statistical Society:

    The RSS will offer the best three presentations and the best poster from the RSC2010 conference the opportunity to present their work at the RSS2010 conference which will be held from 13-17 September in Brighton. The three best presentations will participate in a special session at the conference and the poster will be presented alongside the other posters at the event. The prize will be in the form of free registration at the conference for the four winners. (The registration fee includes many meals and social events but not transport or accommodation).

Further details about the conference can be found at: www.rss.org.uk/rss2010

16

9 Plenary Session

9.1 Professor Jane L. Hutton (University of Warwick)

Opening Address

Jane L. Hutton is a Professor of Statistics in the Department of Statistics, University of Warwick. She works in medical statistics, with special interests in survival analysis, meta-analysis and non-random data. Accelerated failure time models are a particular focus in her research in survival analysis. She has major collaborations in cerebral palsy and epilepsy. Her work with Professor Peter Pharoah and Dr Allan Colver, on life expectancy in cerebral palsy, has had a substantial effect on the size of awards in medico-legal cases. This work is widely cited nationally and internationally. In epilepsy, she has contributed to many Cochrane reviews of anti-epileptic drugs. She is currently working on a research project with Dr Tony Marson, of Liverpool University Neurosciences Department. She has written extensively on ethics and philosophy of statistics. She has contributed to Research Council ethics guidelines.

17

9.2 Professor Jim Q. Smith (University of Warwick)

Title: How to do Research Creatively

Abstract

Making the shift from being a taught student to a researcher is a challenging one. We all develop the skill to deliver to our teachers what they want to see in exams. Now suddenly we must develop a completely distinct set of skills where the point of our work is to produce something *different* from what other researchers do. How can this transition to becoming a creative researcher in Statistics or Probability be managed? In this short talk I will outline some techniques I have developed over the years, some of which I hope you might find useful.

Jim Q. Smith is a Professor of Statistics at Warwick University and has researched a wide range of topics both theoretical and applied, but always Bayesian. He is currently Chair of RISCU, the consultancy arm of the statistics department, and has close research ties with various companies and government departments.

18

9.3 Dr. Jonathan Rougier (University of Bristol)

Title: Complex systems: Accounting for model limitations

Abstract

Many complex systems, notably environmental systems like climate, are highly structured, and numerical models, known as simulators, play an important role in prediction and control. It is crucial to account for limitations in simulators, since these can be substantial, and can vary substantially from one simulator to another. These limitations can be categorised in terms of input uncertainty, parametric uncertainty, and structural uncertainty. The talk explains this framework, and the particular challenge of accounting for simulator limitations in dynamical systems, with illustrations from climate science and natural hazards.

Jonty Rougier is an applied statistician working in the area of computer experiments, particularly for complex environmental systems like climate. He studied Economics and then Statistics at Durham, the latter as a postdoc working with Michael Goldstein and Allan Seheult. He is currently a Lecturer in Statistics in the Department of Mathematics at the University of Bristol.

19

10 List of Sponsors’ Talks

On Wednesday 14th several of the conference sponsors will be giving presentations as part of the main conference programme, providing an opportunity to learn about their statistical work.

Session 6a, Room MS.01, Chair: Jennifer Rogers

Time | Sponsor | Speaker | Title | Pg
14:30 | International Biometric Society | Richard Emsley | The International Biometric Society: What can it offer to Postgraduate Students? | 91
15:05 | Pfizer | Phil Woodward | Bayesian Design & Analysis of Experiments | 91
15:40 | SmartOdds | Robert Mastrodomenico | An Introduction to Football Modelling at Smartodds | 91

Session 6b, Room MS.04, Chair: Mouna Akacha

Time | Sponsor | Speaker | Title | Pg
14:30 | Shell | Wayne Jones | Making Decisions with Confidence - Statistics the Shell Way | 92
15:05 | AHL, Man Group PLC | Martin Layton | An Introduction to AHL | 92

Session 6c, Room MS.05, Chair: Flavio B Goncalves

Time | Sponsor | Speaker | Title | Pg
15:05 | Royal Statistical Society | Helen Thornewell | Support from the RSS and their Young Statisticians Section | 93
15:40 | Lloyds Banking Group | Bill Fite | Opportunities in Probability and Statistical Modelling at Lloyds Banking Group Decision Science | 93

20

11 Talks Schedule

11.1 Monday 12th April

Session – Plenary
Chair: Jennifer Rogers
Room: MS.01, Maths/Stats Building

Time | Speaker | Title | Pg
15:30 | Hutton, Jane L. | Opening Address | 17
15:45 | Smith, Jim Q. | How to do Research Creatively | 18
16:20 | Rougier, Jonathan | Complex systems: Accounting for model limitations | 19

21

11.2 Tuesday 13th April

Session 1a: Image Analysis
Chair: Bryony Hill
Room: MS.01

Time | Speaker | Title | Pg
09:10 | Doshi, Susan | Statistical image reconstruction for cone-beam computed tomography | 32
09:35 | Fallaize, Christopher | Matching Shapes of Different Sizes | 33
10:00 | Khatun, Mahmuda | Morphological Granulometry for Image Texture Analysis and Classification | 34
10:25 | Yan, Lei | Statistical Threshold of Magnetoencephalographic (MEG) Data | 34
10:50 | Llewelyn, Stephanie | Statistical Modelling of Fingerprints | 35

Session 1b: Computational Statistics
Chair: Flavio B Goncalves
Room: MS.04

Time | Speaker | Title | Pg
09:10 | Cainey, Joe | Performance of Pseudo-Marginal MCMC Algorithms | 36
09:35 | O’Hagan, Adrian | Computational Advances in Fitting Mixture Models via the EM Algorithm | 36
10:00 | Prangle, Dennis | Summary statistics for Approximate Bayesian Computation | 37
10:25 | Raychaudhuri, Clare | Investigating methods to approximate the expectation efficiently | 38
10:50 | Vrousai, Dina | Sampling from the posterior - MCMC, Importance resampling or Numerical integration? | 39

22

Session 1c: Operational Research
Chair: Fiona Sammut
Room: MS.05

Time | Speaker | Title | Pg
09:10 | Anacleto-Junior, Osvaldo | Bayesian forecasting models for traffic management systems | 39
09:35 | Aslett, Louis JM | Modelling and Inference for Networks with Repairable Redundant Subsystems | 40
10:00 | May, Benedict | Multi-Armed Bandit with Regressor Problems | 40
10:25 | Moffatt, Joanne | Analysing strategy in the sprint race in track cycling using logistic regression | 41
10:50 | Hashim, Siti R.M. | Interpretation Problems in Multivariate Control Chart | 42

Session 1d: Statistical Inference
Chair: Stephen Burgess
Room: A1.01

Time | Speaker | Title | Pg
09:10 | Jamalzadeh, Amin | Developing Effect Sizes for Non-Normal Data | 42
09:35 | Jesus, Joao | Inference without likelihood | 43
10:00 | McElduff, Fiona | Maximum likelihood estimation of discrete distribution parameters using R | 43
10:25 | Ogundimu, Emmanuel | Investigating the impact of missing data on Cronbach’s alpha estimates and Confidence Intervals | 44
10:50 | Zwiernik, Piotr | Posets, Mobius functions and tree-cumulants | 44

23

Session 2a: Medical Statistics I
Chair: Mouna Akacha
Room: MS.01

Time | Speaker | Title | Pg
11:30 | Ewings, Sean | Modelling Blood Glucose Concentration for People with Type 1 Diabetes | 45
11:55 | Smith, Joanna | Methods for the Analysis of Asymmetry | 46
12:20 | Strawbridge, Alexander | Measurement error correction of the association between fasting blood glucose and coronary heart disease - a structural fractional polynomial approach | 46
12:45 | Verykouki, Eleni | Modelling the effects of antibiotics on carriage levels of MRSA | 47
13:10 | Roloff, Verena | Planning future studies based on the conditional power of a random-effects meta-analysis | 48

Session 2b: Financial
Chair: Murray Pollock
Room: MS.04

Time | Speaker | Title | Pg
11:30 | Lapinski, Tomasz | Modelling the rank system with Gibbs, Bose Einstein or Zipf Law. Application in Mathematical Finance | 48
11:55 | Michelbrink, Daniel | A Martingale Approach to Active Portfolio Selection | 49
12:20 | Pham, Duy | Measuring vega risks of Bermudan swaptions under the Markov-Functional model | 49
12:45 | Shahtahmassebi, Golnaz | Mathematical and Statistical Models for Predicting Financial Behaviour | 50
13:10 | Wang, Chun | An optimal stopping problem of finite horizon with regime switching | 50

24

Session 2c: Elicitation and Epidemiology
Chair: Michelle Stanton
Room: MS.05

Time | Speaker | Title | Pg
11:30 | Elfadaly, Fadlalla G. | On Eliciting Expert Opinion in Generalized Linear Models | 51
11:55 | Noosha, Mitra | Discordancy between the prior and data using conjugate priors | 51
12:20 | Ford, Ashley P. | Indian Buffet Epidemics. A Bayesian Approach to Modelling Heterogeneity | 52
12:45 | Worby, Colin | A hidden Markov model to analyse MRSA transmission in hospital wards | 53
13:10 | Walker, Neil | Estimating the size of a badger population using live capture and post-mortem data | 53

Session 2d: Multivariate Statistics
Chair: Nathan Huntley
Room: A1.01

Time | Speaker | Title | Pg
11:30 | Fayomi, Aisha | Cauchy Principal Components Analysis | 54
11:55 | Sweeney, James | Approximate Joint Statistical Inference for Large Spatial Datasets | 54
12:20 | Tsagris, Michael | Multivariate outliers, the forward search and the Cronbach’s Reliability Coefficient | 55
12:45 | Mohammad, Rofizah | Bayesian Analysis in Multivariate Data | 55
13:10 | Sammut, Fiona | Some Aspects of Compositional Data | 56

25

Session 3a: Genetics
Chair: Dennis Prangle
Room: MS.01

Time | Speaker | Title | Pg
14:30 | Evangelou, Marina | Incorporating available biological knowledge to explore genome-wide association data | 56
14:55 | Fowler, Anna | Informed Bayesian Clustering of Gene Expression Levels | 57
15:20 | Burgess, Stephen | An application of Bayesian techniques for Mendelian randomization to assess causality in a large meta-analysis | 58
15:45 | Cairns, Jonathan | BayesPeak: A Hidden Markov Model for analysing ChIP-seq experiments | 59

Session 3b: Medical Statistics II
Chair: Helen Thornewell
Room: MS.04

Time | Speaker | Title | Pg
14:30 | Hee, Siew Wan | Designing a Series of Phase II Trials | 59
14:55 | Magirr, Dominic | Response-Adaptive Block Randomization in Binary Endpoint Clinical Trials | 60
15:20 | Ren, Shijie | Bayesian clinical trial designs for survival outcomes | 60
15:45 | Yeung, Wai Yin | The power of the biased coin design for clinical trials | 61

26

Session 3c: Dimension Reduction
Chair: James Sweeney
Room: MS.05

Time | Speaker | Title | Pg
14:30 | Chand, Sohail | Oracle properties of Lasso-type methods in Regression problems | 62
14:55 | Khan, Md. Hasinur Rahaman | Penalized Weighted Least Squares Variable Selection Method for AFT Models with High Dimensional Covariates | 62
15:20 | Serradilla, Javier | Latent Variable Models for Process Monitoring | 63
15:45 | Yusoff, Nur Fatihah Mat | A study of item selection using principal component analysis and correspondence analysis | 63

Session 3d: Environmental
Chair: Andrew Smith
Room: A1.01

Time | Speaker | Title | Pg
14:30 | Jones, Emma M. | Using a Bayesian Hierarchical Model for Tree-Ring Dating | 65
14:55 | Norris, Beth | Not another species richness estimator?! | 65
15:20 | Oxlade, Rachel | Uncertainty analysis for multiple ecosystem models using Bayesian emulators | 66
15:45 | Powell, Helen | Estimating biologically plausible relationships between air pollution and health | 67

27

11.3 Wednesday 14th April

Session 4a: Medical Statistics III
Chair: Fiona McElduff
Room: MS.01

Time | Speaker | Title | Pg
09:10 | Iglesias, Alberto Alvarez | An application of survival trees to the study of cardiovascular disease | 68
09:35 | Dooley, Cara | Analysis of an Observational Study to in Colorectal Cancer Patients | 68
10:00 | O’Keeffe, Aidan | Causal Inference in Longitudinal Data Analysis: A Case Study in the Epidemiology of Psoriatic Arthritis | 69
10:25 | Thomas, Maria Roopa | Design and analysis of dose escalation trials | 70
10:50 | Nicholls, Stuart | Modelling parental decisions for newborn bloodspot screening | 70

Session 4b: Point Processes and Spatio-Temporal Statistics
Chair: Chris Fallaize
Room: MS.04

Time | Speaker | Title | Pg
09:10 | Marek, Patrice | Poisson Process Parameter Estimation from Data in Bounded Domain | 71
09:35 | Bakar, Khandoker Shuvo | A Comparison of Bayesian Space-Time Models for Ozone Concentration Levels | 72
10:00 | Proctor, Iain | Multi-level models for ecological response applications | 72
10:25 | Stanton, Michelle | A Spatio-temporal modelling of Meningitis Incidence in sub-Saharan Africa | 73
10:50 | Smith, Andrew | Denoising UK House Prices | 73

28

Session 4c: General
Chair: Michael Tsagris
Room: MS.05

Time | Speaker | Title | Pg
09:10 | Gollini, Isabella | Mixture of Latent Trait Analyzers | 74
09:35 | Klapper, Jennifer | A wavelet based approach to HPLC data analysis | 74
10:00 | Bhattacharya, Sakyajit | Delete-Replace Identity For A Set Of Independent Observations | 75
10:25 | Sanderson, Ria | Modelling Main Contractor Status for the New Orders Survey | 75
10:50 | Wilson, Kevin | Bayes linear kinematics in the analysis of failure rates | 76

Session 4d: Graphical Models and Extreme Value Theory
Chair: Guy Freeman
Room: A1.01

Time | Speaker | Title | Pg
09:10 | Wadsworth, Jenny | Uncertainty in Choice of Measurement Scale for Extreme Value Analysis | 77
09:35 | Youngman, Ben | Modelling extremal phenomena using different data sources | 77
10:00 | Byrne, Simon | Parametrisation of graphical models | 78
10:25 | Caimo, Alberto | Bayesian inference for Social Network Models | 78

29

Session 5a: Experimental Design and Population Genetics
Chair: Andrew Simpkin
Room: MS.01

Time | Speaker | Title | Pg
11:30 | Khadim, Mudakkar M. | Canonical Analysis of Multi-Stratum Response Surface Designs & Standard Errors of Eigenvalues | 79
11:55 | Martin, Kieran | D-optimal design of experiments for a dynamic model with correlated observations | 79
12:20 | Thornewell, Helen | Vulnerability: A 2nd Criterion to Distinguish between Equally-Optimal BIBDs | 80
12:45 | Kershaw, Emma | Surfing In One Dimension | 80
13:10 | Mair, Colette | Dimension Reduction for Human Genomic SNP Variation | 81

Session 5b: Censoring in Survival Data and Non-Parametric Statistics
Chair: Jennifer Rogers
Room: MS.04

Time | Speaker | Title | Pg
11:30 | Elsayed, Hisham Abdel Hamid | Parametric Survival Model with Time-dependent Covariates for Right Censored Data | 82
11:55 | Staplin, Natalie | Assessing the Effect of Informative Censoring in Piecewise Parametric Survival Models | 83
12:20 | Thom, Howard | Dealing with Censoring in Quality Adjusted Survival Analysis and Cost Effectiveness Analysis | 83
12:45 | Aboalkhair, Ahmad M | Nonparametric Predictive Inference for System Reliability | 84
13:10 | Toupal, Tomas | Nonparametric Estimation of Reliability of Two Random Variables Using Kernel Estimation of Density | 85

30

Session 5c: Time Series and Diffusions
Chair: Alexander Strawbridge
Room: MS.05

Time | Speaker | Title | Pg
11:30 | Bhattacharya, Arnab | Sequential Integrated Nested Laplace Approximation | 85
11:55 | Killick, Rebecca | Finding changepoints in a Gulf of Mexico hurricane hindcast dataset | 86
12:20 | Stevens, Kara | Prediction Intervals of the Local Spectrum Estimate | 87
12:45 | Suda, David | Discrete- and Continuous-time Approaches to Importance Sampling on Diffusions | 87
13:10 | Villalobos, Isadora Antoniano | Bayesian inference for diffusions based on exact simulation | 88

Session 5d: Probability
Chair: Duy Pham
Room: A1.01

Time | Speaker | Title | Pg
11:30 | Barranon, Antonio A. Ortiz | A New Bivariate Generalized Pareto Model | 88
11:55 | Huntley, Nathan | Backward Induction and Subtree Perfectness | 89
12:20 | Lee, Rui Xin | On the Convergence of Continuously Monitored Barrier Options Under Markov Processes | 89
12:45 | Wagnerova, Eva | Distortion of Probability Models | 90

31

12 Talk Abstracts by Session

12.1 Tuesday 13th April

12.1.1 Session 1a: Image Analysis

Session Room: MS.01
Chair: Bryony Hill

Start time 09:10

STATISTICAL IMAGE RECONSTRUCTION FOR CONE-BEAM COMPUTED TOMOGRAPHY

Susan Doshi and Chris Jennison

University of Bath, UK

Keywords: Bayesian image analysis, Cone-beam CT, Image-guided radiotherapy

In image-guided radiotherapy, the accuracy of patient positioning is determined using images of internal anatomy in addition to the traditional external markers. This gives confidence that radiation prescribed for the treatment of cancer will be delivered to the desired volume. Treatment is usually delivered five days a week for several weeks, with imaging used on many of these occasions. X-ray cone-beam computed tomography (CBCT) is increasingly being used for this purpose. An X-ray source moves in a circular trajectory around the patient and planar projection images are acquired at increments of 1°. The data in these images are used to reconstruct a 3D representation of the patient.

Conventional Fourier-based reconstruction techniques rely on relatively noise-free projection images, with the entire patient diameter being included in each projection, and with a complete set of projections over more than 180°. Satisfying each of these requirements can be difficult. In addition, metallic fiducial markers may be implanted to help track the movement of soft tissues. These improve visualisation on the projection images, but may cause artefacts in the 3D reconstruction.

Statistical reconstruction techniques can cope naturally with these obstacles. In this presentation, we will introduce the Bayesian approach to image reconstruction. Modelling may be carried out in a number of spaces: the 2D projection image, the 3D patient space, or the 3D sinogram space (formed by ‘stacking’ the 2D projections along a third axis indexed by the projection angle). We can use a normal likelihood, or include aspects of the physical system in a more realistic model. Inference on the structure of the patient is based on MCMC sampling from the posterior distribution, and choices of prior and likelihood are made by considering the trade-off between accurate inference and the time taken to perform this sampling. The methods will be demonstrated using data acquired on clinical systems.

32

Start time 09:35

MATCHING SHAPES OF DIFFERENT SIZES

Christopher Fallaize

University of Leeds, UK

Keywords: Bayesian alignment, MCMC, Scale factor, Statistical shape analysis, Unlabelled landmarks

The shape of an object is the information invariant under the full similarity transformations of rotation, translation and rescaling. In statistical shape analysis, we are concerned with analysing differences in shape between individual objects or populations. To this end, we first seek some optimal registration which removes the effects of orientation, location and size, so that any remaining differences are due to genuine differences in shape.

Objects are often reduced to k points, known as landmarks, in m dimensions and thus can be represented as k × m point configurations. In labelled shape analysis the correspondence between landmarks on different configurations is known. Unlabelled shape analysis deals with the more complex situation where the correspondence between landmarks is unknown. Green and Mardia (Biometrika, 2006 pp. 235–254) developed a Bayesian methodology for the pairwise alignment of two unlabelled configurations using the rigid body transformations of rotation and translation.

We present the extension to full similarity shape by introducing a scaling factor to the model. Taking one of the configurations as a fixed reference, the aim is to estimate the transformation of the other configuration onto the reference whilst simultaneously identifying the matching between landmarks. Particular challenges include efficient simulation from a non-standard distribution for the scale factor and the desire for a symmetrical setup to ensure that equal inferences are drawn regardless of which configuration is taken as the reference. Possible applications include automated image analysis (where objects nearer or further away have different sizes) and biological morphometrics (where objects at different growth stages may be of different sizes). We shall illustrate our methodology with examples using both real and artificial data sets.

33

Start time 10:00

MORPHOLOGICAL GRANULOMETRY FOR IMAGE TEXTURE ANALYSIS AND CLASSIFICATION

Mahmuda Khatun¹, Dr Alison Gray¹ and Prof. Steve Marshall²

¹ Department of Mathematics and Statistics, University of Strathclyde, Glasgow
² Department of Electronic and Electrical Engineering, University of Strathclyde, Glasgow

Keywords: Image analysis, Morphology, Opening, Granulometry, Pattern spectrum, Structuring element

An important area of digital image analysis is the analysis of texture images. A statistical approach to image texture classification based on granulometric moments is described here. Mathematical morphology provides a set of non-linear techniques to extract shape-based information from an image, using image probes in the form of ‘structuring elements’. Opening granulometry is based on a sequence of morphological openings using scaled structuring elements. As the scale increases, more image areas are removed. Pattern spectra are formed by normalising the removed area by the total image area. Since the pattern spectrum is a probability density function, its moments can be calculated. The pattern spectrum moments can be used as texture features for classification.

This work concerns sequences of texture images which evolve in time, and the classification of a new image to a point in time. Statistical models are being built to relate granulometric moments to evolution time directly, using training images for which both the evolution parameters and the time state are known. Each model can be used for back-prediction of evolution time of a new image from its observed granulometric moments. Better predictions are expected by combining different models.

Start time 10:25

STATISTICAL THRESHOLD OF MAGNETOENCEPHALOGRAPHIC (MEG) DATA

Lei Yan, C.J. Brignell and C. D. Litton

School of Mathematical Sciences, University of Nottingham, UK

Keywords: FWER, Random field, Permutation method

In this presentation, we show how magnetoencephalographic (MEG) data can be analysed statistically using parametric (standard and random field) and nonparametric (permutation, bootstrap) methods. Compared to parametric statistical tests, nonparametric statistical tests give the user complete freedom in the choice of the test statistic by which the experimental conditions are compared. We propose statistical thresholds that control the familywise error rate (FWER) across time or across both space and time. These approaches use the distribution of test statistics under the null hypothesis to find FWER thresholds. We show that the original permutation tests cannot control the FWER unless the experimental conditions have the same variance-covariance structure, which is difficult to achieve in practice. Unlike previous permutation-based tests in neuroimaging, we address this problem with a permutation-based test that does not assume that different experimental conditions have the same variance-covariance structure.
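For readers unfamiliar with FWER thresholds from permutation, the following Python sketch shows the standard maximum-statistic construction across time points. It is an illustration only and not the test proposed in the abstract: it assumes paired data (one difference per subject and time point) and uses sign-flipping of whole subjects, and the function names `t_stats` and `fwer_threshold` are hypothetical.

```python
import math
import random
import statistics


def t_stats(diffs):
    """One-sample t statistic at each time point; diffs[s][t] is subject s, time t."""
    n = len(diffs)
    out = []
    for t in range(len(diffs[0])):
        col = [row[t] for row in diffs]
        out.append(statistics.fmean(col) / (statistics.stdev(col) / math.sqrt(n)))
    return out


def fwer_threshold(diffs, n_perm=500, alpha=0.05, seed=0):
    """Upper-alpha quantile of the permutation distribution of max |t| over time."""
    rng = random.Random(seed)
    max_stats = []
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            # Under the null, the sign of a subject's whole difference curve is arbitrary
            sign = rng.choice((-1.0, 1.0))
            flipped.append([sign * v for v in row])
        max_stats.append(max(abs(t) for t in t_stats(flipped)))
    max_stats.sort()
    return max_stats[math.ceil((1 - alpha) * n_perm) - 1]
```

Any observed |t| above the returned threshold would be declared significant with the familywise error rate controlled at roughly alpha, under the exchangeability assumption stated above.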

Start time 10:50

STATISTICAL MODELLING OF FINGERPRINTS

Stephanie Llewelyn

University of Sheffield, UK

Keywords: Fingerprints, Identification, Modelling

It is believed that fingerprints are determined in embryonic development. Unlike other personal characteristics the fingerprint appears to be a result of a random process. For example, fingerprints of identical twins (whose DNA is identical) are distinct, and extensive studies have found little evidence of a genetic relationship in terms of types of fingerprint, certainly at the small scale. At a larger scale the pattern of ridges on fingerprints can be categorised as belonging to one of five basic forms: loops (left and right), whorls, arches and tented arches. The population frequencies of these types show little variation with ethnicity and a list of the types occurring on the ten digits can be used as an initial basis for identification of individuals. However, such a system would not uniquely identify an individual although the frequency of certain combinations could be extremely small. At a smaller scale various minutiae or singularities can be observed in a fingerprint. These include ridge endings and bifurcations, amongst others. Typical fingerprints have several hundred of these as well as two key points (with the exception of a simple arch) referred to as the core and delta, which are focal points of the overall pattern of ridges. Modern identification systems are based upon endings and bifurcations, not least because they are the easiest to determine automatically from image analysis. The configuration of these minutiae is unique to the individual.

The presentation will outline the history of use of fingerprints, illustrate some of these features used for identification and discuss ways in which statistical models could be developed to generate realistic fingerprints using data obtained from fingermarks.

35

12.1.2 Session 1b: Computational Statistics

Session Room: MS.04
Chair: Flavio B Goncalves

Start time 09:10

PERFORMANCE OF PSEUDO-MARGINAL MCMC ALGORITHMS

Joe Cainey

Statistics Group, University of Bristol

Keywords: MCMC, Latent Variable, Pseudo-Marginal, Metropolis-Hastings, GIMH, Autocorrelation

Given the problem of sampling from a distribution π(θ), the Metropolis-Hastings (MH) algorithm is often used to generate a Markov Chain with invariant distribution π(θ). In cases where π(θ) is intractable, or too complex to evaluate, a different approach must be taken. It is often possible to instead construct a Markov Chain with invariant distribution π(θ, z), where z can be missing data, or latent variables which make π(θ, z) easier to evaluate, which is known as data augmentation. A pseudo-marginal algorithm attempts to combine the precision of the marginal sampler with the computational efficiency of data augmentation techniques.

Grouped Independence Metropolis-Hastings (GIMH) is a pseudo-marginal algorithm which uses importance sampling to estimate π(θ). When running any form of MCMC sampler, the performance of the resulting chain is of great importance. We show that as the number of importance sampling particles approaches infinity the performance of the chain produced by the GIMH algorithm converges to that of the marginal algorithm.
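The pseudo-marginal idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the author's implementation: the model (z ~ N(θ, 1), y | z ~ N(z, 1), prior θ ~ N(0, 10²)), the random-walk proposal and the names `estimate_target` and `gimh` are all hypothetical choices made for the example. The key feature is that the noisy estimate for the current state is stored and reused rather than refreshed.

```python
import math
import random


def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def estimate_target(theta, y, n_particles, rng):
    """Unbiased importance-sampling estimate of prior(theta) * p(y | theta)."""
    # Particles z_k are drawn from p(z | theta); p(y | z_k) is averaged over them
    lik = sum(normal_pdf(y, rng.gauss(theta, 1.0), 1.0)
              for _ in range(n_particles)) / n_particles
    return normal_pdf(theta, 0.0, 10.0) * lik


def gimh(y, n_iter, n_particles, step=2.0, seed=0):
    """Random-walk Metropolis-Hastings driven by noisy target estimates."""
    rng = random.Random(seed)
    theta = 0.0
    pi_hat = estimate_target(theta, y, n_particles, rng)
    chain = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)
        pi_prop = estimate_target(prop, y, n_particles, rng)
        # Accept with probability min(1, pi_prop / pi_hat); pi_hat is the stored estimate
        if rng.random() * pi_hat < pi_prop:
            theta, pi_hat = prop, pi_prop
        chain.append(theta)
    return chain
```

In this toy model the exact marginal is available (y | θ ~ N(θ, 2)), so the chain can be checked against the known posterior, and increasing `n_particles` can be used to observe the behaviour approaching that of the marginal sampler.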

Start time 09:35

COMPUTATIONAL ADVANCES IN FITTING MIXTURE MODELS VIA THE EM ALGORITHM

Adrian O’Hagan

University College Dublin, Ireland

Keywords: Expectation-Maximisation Algorithm, Starting values, Multimodal likelihood functions, Convergence rate, Multicycle ECM Algorithm

The Expectation-Maximisation (EM) Algorithm is a popular tool for deriving maximum likelihood estimates in a large family of statistical models. Chief among its attributes is the property that the algorithm always drives the likelihood uphill. However, it can be difficult to assess convergence and, in the case of multimodal likelihood functions, the algorithm may become trapped at a local maximum.

We introduce a variety of schemes to promote algorithmic efficiency. A range of “burn-in” functions are described. These can produce initialising values for the EM algorithm of a higher quality than those arising from simply employing random starts. The use of likelihood monitoring and multicycle features allows maximization steps to be ordered and targeted on parameter subsets. Outcomes are compared with those from the model-based clustering package mclust in R where a hierarchical clustering initialisation is performed. The overall goal is to increase convergence rates to the global likelihood maximum and/or to attain the global maximum in a higher percentage of cases.
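As a baseline for the random-start strategy that the abstract aims to improve on, the sketch below runs EM for a two-component univariate Gaussian mixture from several random starts and keeps the fit with the highest final log-likelihood. It is a simplified illustration, not the author's scheme: the component standard deviations are assumed known and equal to 1, and the names `em_fit` and `best_of_random_starts` are hypothetical.

```python
import math
import random


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def log_lik(xs, w, m1, m2):
    return sum(math.log(w * phi(x - m1) + (1 - w) * phi(x - m2)) for x in xs)


def em_fit(xs, w, m1, m2, n_iter=50):
    """EM for the mixture w*N(m1, 1) + (1-w)*N(m2, 1); returns parameters and log-likelihood trace."""
    trace = [log_lik(xs, w, m1, m2)]
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each observation
        r = [w * phi(x - m1) / (w * phi(x - m1) + (1 - w) * phi(x - m2)) for x in xs]
        # M-step: weighted proportion and weighted means
        s = sum(r)
        w = s / len(xs)
        m1 = sum(ri * x for ri, x in zip(r, xs)) / s
        m2 = sum((1 - ri) * x for ri, x in zip(r, xs)) / (len(xs) - s)
        trace.append(log_lik(xs, w, m1, m2))
    return (w, m1, m2), trace


def best_of_random_starts(xs, n_starts=5, seed=0):
    """Start the means at randomly chosen data points and keep the best run."""
    rng = random.Random(seed)
    fits = [em_fit(xs, 0.5, *rng.sample(xs, 2)) for _ in range(n_starts)]
    return max(fits, key=lambda fit: fit[1][-1])
```

The returned trace makes the uphill property visible: the log-likelihood should never decrease from one iteration to the next, which is a convenient check on any EM implementation.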

Start time 10:00

SUMMARY STATISTICS FOR APPROXIMATE BAYESIAN COMPUTATION

Dennis Prangle

Lancaster University

Keywords: ABC, MCMC, Bayesian statistics

Approximate Bayesian Computation (ABC) methods are a family of algorithms for ‘likelihood-free’ Bayesian inference. The domain of use is models where numerical evaluation of the likelihood is impossible or impractical, but from which data can easily be simulated. For example, over the last decade ABC has allowed investigation of realistic but previously intractable models in population genetics. Other applications include infectious disease epidemiology and missing data models.

ABC operates by simulating data X_sim from the model of interest for many parameter values θ and constructing an approximation to the posterior from those θ values for which the associated X_sim closely matches the observations X_obs. Algorithms have been proposed which implement this idea within the frameworks of rejection and importance sampling, Markov Chain Monte Carlo and Sequential Monte Carlo.

A key insight in past research is that to achieve practical acceptance rates, ‘closeness of match’ should be judged by some norm ||S(X_sim) − S(X_obs)||, where S(.) are low dimensional summary statistics of a data set. However, the problem of how to choose S well is an open question in the literature.

This talk uses visual examples to introduce the main ideas of ABC and describe a novel methodology for constructing efficient summary statistics. Theoretical support for the method is also briefly outlined.
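The rejection form of ABC described above can be written as a short Python sketch. It is an illustration under assumptions that are not in the abstract: the model is X_i ~ N(θ, 1), the prior is Uniform(−5, 5), the summary statistic S is the sample mean, and the names `simulate` and `abc_rejection` are hypothetical. It does not implement the talk's method for constructing summary statistics.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Draw a data set of size n from the assumed model N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_rejection(x_obs, n_draws, eps, seed=0):
    """Keep prior draws whose simulated summary lies within eps of the observed one."""
    rng = random.Random(seed)
    s_obs = statistics.fmean(x_obs)          # S(X_obs)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)       # draw from the prior
        x_sim = simulate(theta, len(x_obs), rng)
        if abs(statistics.fmean(x_sim) - s_obs) < eps:   # |S(X_sim) - S(X_obs)| < eps
            accepted.append(theta)
    return accepted
```

Shrinking `eps` tightens the approximation to the posterior at the cost of a lower acceptance rate, which is the trade-off that motivates low-dimensional summaries.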

37

Start time 10:25

INVESTIGATING METHODS TO APPROXIMATE THE EXPECTATION EFFICIENTLY

Clare Raychaudhuri

University of Bristol, UK

Keywords: variance reduction, Monte-Carlo methods

Suppose we wish to estimate the expectation of a function g(x) with respect to the standard Gaussian distribution, i.e. the Gaussian distribution with mean 0 and variance the identity matrix. One method to estimate this expectation is to use the basic Monte-Carlo estimate

\[ \frac{1}{n} \sum_{i=1}^{n} g(x_i). \]

However, basic Monte-Carlo methods may require a large number of function evaluations for the estimate to converge. Luckily it is often possible to speed up this convergence using control variates. In order to use a control variate it is required that there exists a function α(x) for which the expectation is known, E{α(x)} = c, and which has a strong correlation with g(x). The new estimator µ for E{g(x)} is

\[ \mu = \frac{1}{n} \sum_{i=1}^{n} \left\{ g(x_i) + [c - \alpha(x_i)]\, B \right\}. \]

The variance of this estimator is minimised when B = B∗, where

\[ B^{*} = \mathrm{Var}\{\alpha(X)\}^{-1}\, \mathrm{Cov}\{\alpha(X), g(X)\}. \]

However, B∗ is often not known, so it has to be estimated using linear regression. In the case where α(X) = X and so c = 0, this problem is equivalent to estimating the intercept of a linear regression of Y = g(X) on (1, X). Unfortunately this is a biased estimator of E{g(x)} since the same data points are used both to estimate B and to estimate µ. Therefore a method such as the jack-knife should be applied to reduce the estimator bias and provide an estimate of the variance of the estimator.

While using linear regression is appropriate when n ≫ q + p (where p is the dimension of y and q is the dimension of x), it is not appropriate if there is only a small sample size n. In this case dimension reduction techniques such as principal component analysis or partial least squares analysis can be considered.
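The control-variate estimator above can be illustrated with a short Python sketch for scalar X. It is a sketch under assumptions, not the author's code: it takes α(X) = X so that c = 0, estimates B from the same sample (and therefore carries the bias discussed in the abstract, with no jack-knife correction), and the function name `control_variate_estimate` is hypothetical.

```python
import random
import statistics


def control_variate_estimate(g, n, seed=0):
    """Estimate E[g(X)] for X ~ N(0, 1), with and without the control variate alpha(X) = X."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    gs = [g(x) for x in xs]
    x_bar = statistics.fmean(xs)
    g_bar = statistics.fmean(gs)
    # Sample version of B* = Var{alpha(X)}^{-1} Cov{alpha(X), g(X)}
    var_x = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    cov_xg = sum((x - x_bar) * (gv - g_bar) for x, gv in zip(xs, gs)) / (n - 1)
    b_hat = cov_xg / var_x
    plain = g_bar                              # basic Monte-Carlo estimate
    adjusted = g_bar + (0.0 - x_bar) * b_hat   # adds the mean of [c - alpha(x_i)] * B with c = 0
    return plain, adjusted
```

For a g that is exactly linear in x, the adjusted estimate recovers the intercept with no Monte-Carlo error, which matches the regression interpretation given in the abstract; for nonlinear g the gain depends on how strongly g(X) correlates with X.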

Start time 10:50

SAMPLING FROM THE POSTERIOR - MCMC, IMPORTANCE RESAMPLING OR NUMERICAL INTEGRATION?

Dina Vrousai and John Haslett

Trinity College Dublin, Ireland

Keywords: Numerical Integration, MCMC

Many methods and algorithms have been developed to sample from the posterior distribution. Importance resampling (IR) and particularly Markov Chain Monte Carlo (MCMC) methods are widely used for this purpose. Sampling from the posterior using these methods doesn’t require knowledge of the normalizing constant. An alternative is to compute the normalizing constant and then to sample from the posterior. This can be very computationally demanding, especially in high dimensional problems.

We are using a recently released R package which implements multidimensional integration algorithms, only for Riemann integrals over the unit hypercube. The aim is to compare the special characteristics of these three methods (IR, MCMC and numerical integration) using an application on blood lactate data. We are using Kriging with Gaussian processes to model these data. We then compare the posterior distributions for our model obtained using these three methods.
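To show why importance resampling needs the posterior only up to its normalizing constant, here is a small sketch on a toy target. Everything in it is an assumption for illustration: the target kernel (that of a N(1, 0.5²) density), the N(0, 2²) proposal, the sample sizes and the function names. It is unrelated to the blood lactate application.

```python
import math
import random


def importance_resample(log_target, n_proposals=20000, n_out=2000, seed=0):
    """Sampling-importance-resampling with an assumed N(0, 2^2) proposal."""
    rng = random.Random(seed)
    prop_sd = 2.0
    draws = [rng.gauss(0.0, prop_sd) for _ in range(n_proposals)]
    # Log weight = log target - log proposal; additive constants cancel
    # when the weights are normalised, so both densities may be unnormalised.
    log_w = [log_target(t) + t * t / (2.0 * prop_sd ** 2) for t in draws]
    shift = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - shift) for lw in log_w]
    # Resample with replacement, with probability proportional to the weights.
    return rng.choices(draws, weights=weights, k=n_out)


def log_unnormalised_posterior(theta):
    # Kernel of N(1, 0.5^2); the normalizing constant is deliberately omitted.
    return -((theta - 1.0) ** 2) / (2.0 * 0.5 ** 2)


sample = importance_resample(log_unnormalised_posterior)
print(sum(sample) / len(sample))
```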

12.1.3 Session 1c: Operational Research

Session Room: MS.05
Chair: Fiona Sammut

Start time 09:10

BAYESIAN FORECASTING MODELS FOR TRAFFIC MANAGEMENT SYSTEMS

Osvaldo Anacleto-Junior and Dr. Catriona Queen

The Open University, Department of Mathematics and Statistics

Many roads have real-time traffic flow data available which can be used as part of a traffic management system. In a traffic management system, traffic flows are monitored over time with the aim of reducing congestion by taking actions, such as imposing variable speed limits or diverting traffic onto alternative routes, when problems arise. Reliable short-term forecasting models of traffic flows are crucial for monitoring traffic flows and, as such, are crucial to the ultimate success of any traffic management system.

The model used here for forecasting traffic flows uses a directed acyclic graph (DAG) in which the nodes represent the time series of traffic flows at the various data collection sites, and the links between nodes represent the conditional independence and causal structure between flows at different sites. The DAG breaks the multivariate model into simpler univariate components, each one being a dynamic linear model. This makes the model computationally simple, no matter how complex the traffic network is, and allows the forecasting model to work in real-time, as required by any traffic management system.

This talk will report current research in the development of this class of model with particular reference to a busy motorway junction in the UK.

Start time 09:35

MODELLING AND INFERENCE FOR NETWORKS WITH REPAIRABLE REDUNDANT SUBSYSTEMS

Louis JM Aslett and Simon P Wilson

Trinity College Dublin, Ireland

Keywords: Bayesian inference, reliability theory, phase-type distributions, telecommunications, MCMC

We consider the problem of modelling the reliability of a network of subsystems where each subsystem has redundancy and is repairable. The motivation for this work is large-scale telecommunications networks.

The time to failure of the subsystem hardware is modelled by an appropriate Markov process and hence follows a phase-type distribution. The network structure defines a failure rule in terms of the states of the subsystems, allowing computation by Monte Carlo simulation of the time to failure distribution for the network. When data on the reliability of the subsystems are available, they can be incorporated via modifications to an existing Bayesian inference approach to update the prediction of network reliability.

Start time 10:00

MULTI-ARMED BANDIT WITH REGRESSOR PROBLEMS

Benedict May and Dr. David Leslie

University of Bristol, UK

Keywords: Bandit Problem, Reinforcement Learning, Linear Regression, Nonparametric Regression

The multi-armed bandit problem is a simple example of the exploitation/exploration trade-off generally inherent in reinforcement learning problems. An agent is tasked with learning from experience how to sequentially make decisions in order to maximise average reward. In the extension considered, the agent is presented with a regressor before making each decision. The agent has to balance the tendency to explore apparently sub-optimal actions (in order to improve regression function estimates) against the tendency to exploit the current estimates (in order to maximise reward). Study of several past approaches to similar problems has indicated particular desirable properties for the policy used. These properties motivate the choice and study of the algorithm that features in this work. The theoretical properties of the algorithm have been studied and it has been tested on both linear and nonparametric regression problems. The intuitive algorithm has useful convergence properties and, compared to many conventional methods, performs well in simulations.

Start time 10:25

ANALYSING STRATEGY IN THE SPRINT RACE IN TRACK CYCLING USING LOGISTIC REGRESSION

Joanne Moffatt¹, Philip Scarf¹, Louis Passfield² and Ian McHale¹

¹ Centre for Operations Management, Management Science and Statistics, Salford Business School, University of Salford, UK

² Centre for Sports Studies, University of Kent, UK

Keywords: Individual sprint race, Track cycling, Strategies, Logistic regression

Competitors and coaches in sports continually try to gain a competitive edge by optimising strategy. One highly tactical contest is the individual sprint in track cycling, where one small strategic error can potentially cost the competitor the race. The aim of this research is to use statistical analysis to give insight into strategies in this event. Eight logistic regression models were developed to predict the probability of the leading rider winning from different stages of the race, based on how the race proceeded just before each stage. Logistic regression was selected since it is suitable to use when there are a large number of potential strategies. It also has the advantage of being simple to implement and straightforward to interpret. Key strategies were successfully identified from the models, including how the leading rider can defend their lead and how the following rider optimises their chances of overtaking.

Start time 10:50

INTERPRETATION PROBLEMS IN MULTIVARIATE CONTROL CHART

Siti R.M. Hashim

University of Sheffield, UK

Keywords: multivariate control chart, multivariate processes, quality control, diagnostic method, correlation

Multivariate control charts have assumed a major role in the quality control of multivariate processes. Unlike univariate control charts, the interpretation of the out-of-control signals triggered from a multivariate control chart is not a straightforward task. Practitioners and quality control researchers have proposed a few diagnostic methods to deal with this problem. Unfortunately, most of the proposed methods do not perform similarly under different types of mean shifts and correlation. As a result, different diagnostic methods might lead to different interpretations and conclusions. In this study, a few diagnostic methods are selected and tested under different types of mean shifts and correlations. The performance of the selected diagnostic methods is measured by the percentage of correct identifications with respect to the different mean shifts and correlations. A general guideline will be given with respect to the selection of the appropriate diagnostic methods in interpreting the signals produced by a multivariate control chart.

12.1.4 Session 1d: Statistical Inference

Session Room: A1.01
Chair: Stephen Burgess

Start time 09:10

DEVELOPING EFFECT SIZES FOR NON-NORMAL DATA

Amin Jamalzadeh

Durham University, UK

Keywords: Effect size, hypothesis test, two sample t-test, Normal distribution, Weibull distribution

The classical hypothesis testing model seeks to determine whether to reject the hypothesis of the non-existence of a phenomenon. Therefore, statistical significance does not necessarily provide information about the importance or magnitude of the phenomenon. There are indicators, known as effect sizes (ES), which are used by some to quantify the degree to which a phenomenon exists. Statistical significance is not a direct measure of ES, but there exists a functional relationship between the sample size, the ES and the p-value. For this reason, if the sample size is sufficiently large even a weak ES may appear as statistically significant. The ES has been mainly introduced and investigated based on an assumption of normal distribution for the underlying population. However, there are many circumstances where the populations are non-Normal, or depend on scale and shape and not just a location parameter.

We will review how to interpret the effect size for two independent sample comparison studies when the assumption of normality holds. We will also investigate how results change when the parameters of location and scale both change for a normal population. We introduce explorations for effect sizes for phenomena in which the variable follows a distribution with shape and scale parameters. As a special case, power analysis and sample size determinations will be discussed for continuous and discrete Weibull distributions for two sample comparison. Finally, for an application, we show how to detect the effect of some factors on the amount of time spent and the number of pages viewed while a user surfs on an E-commerce website.
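The relationship between sample size, effect size and significance in the normal two-sample case can be made concrete with a short sketch. It assumes the familiar Cohen's d as the effect size, which the abstract does not name, and the function names are hypothetical. For equal-variance groups, the pooled two-sample t statistic equals d multiplied by sqrt(n1 n2 / (n1 + n2)).

```python
import math
import statistics


def cohens_d(x, y):
    """Standardised mean difference using the pooled standard deviation."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    return (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(pooled_var)


def t_from_d(d, n1, n2):
    """Pooled two-sample t statistic implied by effect size d and group sizes."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))


d = cohens_d([1, 2, 3], [2, 3, 4])
# Same effect size, very different test statistics as the groups grow.
print(d, t_from_d(d, 3, 3), t_from_d(d, 300, 300))
```

Holding d fixed, the t statistic grows like the square root of the sample size, which is why even a weak effect becomes statistically significant in a large enough sample.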

Start time 09:35

INFERENCE WITHOUT LIKELIHOOD

Joao Jesus

University College London, UK

Keywords: Estimating Functions, Method of Moments, Efficiency, Minimal Variance, Simulation, Rainfall

Maximum likelihood estimation has been shown to be optimal for numerous classes of statistical models. However, there are still many cases for which it is not possible to derive a likelihood, and where traditionally moment based inference is used. The aim of this talk is to show some asymptotic results for moment based estimators, including consistency and efficiency. We investigate the validity of the asymptotic results for finite samples using simulations. The particular processes chosen are from a class of models for rainfall based on point processes, which are widely present in the rainfall modelling literature and are also used by official bodies like the UK Climate Impacts Programme.

Start time 10:00

MAXIMUM LIKELIHOOD ESTIMATION OF DISCRETE DISTRIBUTION PARAMETERS USING R

Fiona McElduff, Mario Cortina-Borja and Angie Wade

Centre for Paediatric Epidemiology and Biostatistics, Institute of Child Health, UCL.

Keywords: discrete distributions, maximum likelihood estimation, rapid estimation

Value inflation, truncation and overdispersion frequently appear in discrete datasets. The most widely used model for discrete data is the Poisson distribution, but in practice the equal mean-variance assumption is often not supported by the observations. Many probability distribution functions have been developed to improve the modelling of highly skewed variables. It is of interest to fit several models corresponding to competing hypotheses about the data-generating mechanism. We have developed an R library to fit a comprehensive range of probability distributions to discrete data using maximum likelihood estimation. The library includes models characterised as parameter-mix Poisson distributions and members of the Lerch and generalized Hypergeometric families, as well as their modified versions, e.g. those incorporating value-inflation and truncation. Models are compared using the BIC. We apply this methodology to several datasets within the field of child health research.

Start time 10:25

INVESTIGATING THE IMPACT OF MISSING DATA ON CRONBACH’S ALPHA ESTIMATES AND CONFIDENCE INTERVALS

Emmanuel Ogundimu

University of Warwick, UK

Cronbach’s alpha is widely used to describe the reliability of tests and measurements. Point estimates of Cronbach’s alpha are readily computed by statistical software, and methods for constructing confidence intervals have also been suggested in the literature. However, both point estimates and confidence intervals of Cronbach’s alpha can give misleading results when data are missing. We demonstrate in a Monte Carlo study the impact of missing data on point estimates and confidence intervals for Cronbach’s alpha when items in tests have homogeneous or heterogeneous covariance, and when an underlying normality assumption holds or is violated for test items. In particular, we assess the coverage rates of the Exact, Normal theory (NT) and Asymptotic Distribution Free (ADF) intervals for Cronbach’s alpha. Four methods of imputing missing item scores were evaluated. Finally, we recommend the ‘best’ imputation techniques for test developers to use when their data fall within the scenarios described in this study.
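For readers unfamiliar with the statistic, the point estimate of Cronbach's alpha for k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below computes it on complete cases only; the data are invented, the function names are hypothetical, and complete-case deletion is shown merely as the simplest way to handle a missing entry, not as one of the four imputation methods evaluated in the study.

```python
import statistics


def complete_cases(scores):
    """Drop respondents with any missing item (None); listwise deletion."""
    return [row for row in scores if all(v is not None for v in row)]


def cronbach_alpha(scores):
    """scores: one row per respondent, k item scores per row, no missing values."""
    k = len(scores[0])
    item_vars = [statistics.variance([row[j] for row in scores]) for j in range(k)]
    total_var = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented scores for five complete respondents and one with a missing item.
raw = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 4, 5], [3, 3, 4], [4, None, 4]]
alpha = cronbach_alpha(complete_cases(raw))
print(round(alpha, 4))
```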

Start time 10:50

POSETS, MOBIUS FUNCTIONS AND TREE-CUMULANTS

Piotr Zwiernik

University of Warwick, UK

Keywords: partially ordered sets, cumulants, model identifiability, Bayesian networks with hidden variables, phylogenetic tree models, binary data

It has been noted by several authors that in the case of multivariate models cumulants often form a convenient system of coordinates. We investigate Bayesian networks on rooted trees where all variables in the system are binary and the inner nodes represent hidden variables. We show that in this case we can construct a more flexible change of coordinates. This change depends on classical results in the theory of partially ordered sets, which mirrors the combinatorial definition of cumulants.

The new coordinates give us a good understanding of the structure of the models under consideration. The nice structure of the parameterization allows us, for example, to understand the identifiability issues for this class of models: the formulae for the estimators in the case when the model is identified and the structure of the MLE fibers in the case when it is not.

12.1.5 Session 2a: Medical Statistics I

Session Room: MS.01
Chair: Mouna Akacha

Start time 11:30

MODELLING BLOOD GLUCOSE CONCENTRATION FOR PEOPLE WITH TYPE 1 DIABETES

Sean Ewings

University of Southampton, UK

Type 1 diabetes mellitus is a chronic metabolic disorder which affects millions of people worldwide. It is characterised by loss of insulin-production mechanisms which results in prolonged high blood glucose concentration (hyperglycaemia). Day-to-day treatment is the responsibility of the individual and is based on injections of insulin. Insulin requirements are assessed daily according to various lifestyle factors, predominantly diet and exercise. Poor control of the illness is associated with many short- and long-term health complications such as ketoacidosis, cardiovascular events (heart disease, stroke) and neuropathy. Diabetes UK (DUK) currently supports a three-year study to investigate and model the effect of physical activity on capillary blood glucose concentration. Volunteers to the study have blood glucose and exercise (as metabolic equivalent of task, MET) recorded continuously over a number of days. Food and insulin regimes are also recorded. Previous research provides models for the action of ingested carbohydrate and injected insulin in the blood. These models may be combined with the information on METs in order to investigate the behaviour of blood glucose concentration. The focus is on a descriptive model that can aid current treatment and hence limit complications. Currently, various time series models, including Dynamic Linear Models, are being investigated.

Start time 11:55

METHODS FOR THE ANALYSIS OF ASYMMETRY

Joanna Smith

University of Glasgow, UK

Keywords: shape analysis, asymmetry, landmarks

There is interest in knowing the extent of asymmetry present in the breasts of patients who have undergone a unilateral mastectomy and reconstruction procedure. Three-dimensional images were captured for 44 such patients, and each case was then marked with ten anatomically significant landmarks. Asymmetry can be quantified as the degree to which there is a mismatch between a landmark configuration (the set of all landmarks on an individual image) and its relabelled and matched reflection. After a configuration has been reflected, rotated and scaled to minimise the sum of squared distances between corresponding landmarks, we should have removed any location, orientation and size effects and be left purely with the genuine shape differences. This can be quantified into an asymmetry score for each patient.

These asymmetry scores give an indication of the overall asymmetry present in a case; however, it is also possible to examine what factors are contributing to this asymmetry. We can assess how much of the asymmetry that is present is due to the location, orientation and size of the reconstructed breast separately. It follows that any asymmetry remaining after these transformations is due to a difference in the actual shape of the breasts, or an ‘intrinsic asymmetry’. It is also desirable to examine asymmetry over the whole surface of the breasts, rather than just the landmarks. In order to do this, we create a set of comparable points across all breasts, so that they all have the same number of points which are in corresponding positions. Then, after reflection, the asymmetry can be quantified by calculating the distances between these corresponding points on the reconstructed and unreconstructed breast. The shape differences between the two breasts can also be examined by a principal components analysis.

Start time 12:20

MEASUREMENT ERROR CORRECTION OF THE ASSOCIATION BETWEEN FASTING BLOOD GLUCOSE AND CORONARY HEART DISEASE - A STRUCTURAL FRACTIONAL POLYNOMIAL APPROACH

Alexander Strawbridge

MRC Biostatistics Unit, Cambridge

Keywords: measurement error, fractional polynomials, regression calibration, epidemiology

Some epidemiological variables such as height and weight may be assumed to be measured precisely; however, others such as blood pressure, blood glucose or food intake may be subject to substantial measurement error.

Fractional polynomials are widely used in epidemiological studies to model continuous non-linear exposure-response relationships, but measurement error can lead to serious bias in the parameter estimates in our models. Regression calibration is an intuitive and easily implemented method for modelling the relationship between true exposure and observed exposure when repeat measurements are available.

We show how fractional polynomials and regression calibration can be combined to produce a model that is corrected for the bias induced by measurement error. We then illustrate this method on a dataset looking at the association between fasting blood glucose and the risk of coronary heart disease events and show that measurement error may be leading us to underestimate the risk associated with higher than normal levels of blood glucose.

Start time 12:45

MODELLING THE EFFECTS OF ANTIBIOTICS ON CARRIAGE LEVELS OF MRSA

Eleni Verykouki

University of Nottingham, UK

Keywords: Markov Models, Maximum Likelihood, MCMC

Methicillin-Resistant Staphylococcus aureus (MRSA) is a bacterium that is usually found on the skin and in the nose. Once it enters the body it becomes harmful as it is resistant to antibiotics and is one of the most serious causes of nosocomial and surgical site infections. In this project we are interested in assessing the effect of antibiotics on MRSA using data taken from a hospital study in London. A discrete-time Markov chain model is used to describe the daily MRSA carriage level in patients. Frequentist and Bayesian inference for the model parameters is drawn via maximum likelihood and MCMC methods respectively. We validate our methodology using simulated data and then we fit our model to the real data (obtained from the above study). Finally, we discuss how chi-square tests can be used to assess the goodness of fit.
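For a fully observed discrete-time Markov chain without covariates, the maximum likelihood estimate of each transition probability is the observed proportion of moves from state i to state j. The sketch below shows this simplest case only; the coding of carriage levels as integers, the toy sequences and the name `transition_mle` are assumptions, and the model in the talk (which includes antibiotic effects) is richer than this.

```python
def transition_mle(sequences, n_states):
    """MLE of a transition matrix from daily state sequences (one per patient)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for seq in sequences:
        for today, tomorrow in zip(seq, seq[1:]):
            counts[today][tomorrow] += 1
    probs = []
    for row in counts:
        total = sum(row)
        # Rows with no observed departures are left undefined (NaN).
        probs.append([c / total if total else float("nan") for c in row])
    return probs


# Hypothetical carriage levels: 0 = not detected, 1 = detected.
patients = [[0, 0, 1, 1, 0], [1, 1, 1, 0]]
p_hat = transition_mle(patients, n_states=2)
print(p_hat)
```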

Start time 13:10

PLANNING FUTURE STUDIES BASED ON THE CONDITIONAL POWER OF A RANDOM-EFFECTS META-ANALYSIS

Verena Roloff and Julian Higgins

MRC Biostatistics Unit, Cambridge, UK

Keywords: Random-effects meta-analysis, conditional power, sample size, information size, heterogeneity

Systematic reviews like those produced by The Cochrane Collaboration often provide recommendations for further research. When meta-analyses are inconclusive, such recommendations typically argue for further studies to be conducted. However, the nature and amount of future research should depend on the nature and amount of the existing research. We propose a method based on conditional power to make these recommendations more specific. Assuming a random-effects meta-analysis model, we evaluate the influence of the number of additional studies, of their information sizes and of the heterogeneity anticipated among them on the ability of an updated meta-analysis to detect a pre-specified effect size. The conditional powers of possible design alternatives can be summarized in a simple graph which can also be the basis for decision making. An example from the literature is used to demonstrate our strategy. We find that if heterogeneity is anticipated, it might not be possible for a single study to reach the desirable power no matter how large it is.

12.1.6 Session 2b: Financial

Session Room: MS.04
Chair: Murray Pollock

Start time 11:30

MODELLING THE RANK SYSTEM WITH GIBBS, BOSE EINSTEIN OR ZIPF LAW. APPLICATION IN MATHEMATICAL FINANCE

Tomasz Lapinski

University of Warwick, UK

Rank systems frequently occur in areas such as linguistics, physics, economics and finance; therefore their structure varies significantly. Existing modelling approaches have been developed and introduced separately to meet the needs of the particular discipline.

However, it turns out that for a particular rank system, which has not been explored before, we are able to combine the existing approaches and then determine which of the distributions is the most appropriate: Gibbs, Bose-Einstein or Zipf Law, assuming that in real life such a system obeys the maximum entropy principle.

In particular, this approach could be used in financial mathematics, for the choice of an optimal portfolio of assets.

Start time 11:55

A MARTINGALE APPROACH TO ACTIVE PORTFOLIO SELECTION

Daniel Michelbrink

The University of Nottingham, UK

Keywords: active portfolio selection, martingales, expected utility maximisation, geometric Brownian motion

An active portfolio selection problem is considered where an investor is interested in outperforming a benchmark portfolio. This benchmark can be given, for example, by a stock index.

The investor chooses to maximise expected utility from the ratio of his portfolio and the benchmark. The problem can then be solved using a stochastic control approach or a martingale approach. We will present the latter one.

Start time 12:20

MEASURING VEGA RISKS OF BERMUDAN SWAPTIONS UNDER THE MARKOV-FUNCTIONAL MODEL

Duy Pham and Dr. Joanne E Kennedy

Department of Statistics, University of Warwick, UK

Keywords: Markov-Functional, Bermudan swaption, Hedging, vega risks

Markov-Functional (MF) models form a popular class of models in which the value of pure discount bonds can be expressed as a functional of some (low-dimensional) Markov process. We shall consider a particular application of MF models: pricing and hedging Bermudan swaptions, which are by far the most common in the interest rate derivatives market. In practice, calculation of risk sensitivities for a Bermudan swaption is as important as calculation of its value. In this work, we consider different parametrizations of the driving Markov process and their implications for the Bermudan swaption’s vega risks.

Start time 12:45

MATHEMATICAL AND STATISTICAL MODELS FOR PREDICTING FINANCIAL BEHAVIOUR

Golnaz Shahtahmassebi

University of Plymouth, UK

Keywords: Ultra high frequency financial data, Poisson difference distribution, decomposition, Bayes, Markov chain Monte Carlo

In this study we introduce the application of the Poisson difference (PD) distribution to ultra high frequency financial data. To investigate the behaviour of index change, PD models were implemented in a Bayesian framework via Markov chain Monte Carlo (MCMC) methods. In order to capture the excess of zero counts in the data, a zero-inflated version of the distribution is used. In addition, a decomposition (ADS) model, which decomposes an index change into three components (index activity, direction and size of the index change), was also considered using the Bayesian approach. Both of the models predicted the index change with a reasonable degree of accuracy. However, the PD model might be easier and less time consuming to implement in online applications, e.g. making predictions. The Gelman convergence diagnostics showed good convergence of the chains for both the ADS and PD models.
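The Poisson difference distribution is the law of the difference of two independent Poisson counts, and a zero-inflated version adds extra probability mass at zero. The following sketch evaluates both probability mass functions by direct summation. The parameter names, the truncation at 200 terms and the mixing weight `pi0` are assumptions for illustration; the Bayesian fitting described in the abstract is not shown.

```python
import math


def poisson_pmf(j, lam):
    """P(N = j) for N ~ Poisson(lam), computed on the log scale."""
    return math.exp(j * math.log(lam) - lam - math.lgamma(j + 1))


def pd_pmf(k, lam1, lam2, terms=200):
    """P(N1 - N2 = k) for independent N1 ~ Poisson(lam1), N2 ~ Poisson(lam2).

    Sums P(N1 = j + k) P(N2 = j) over j, truncated after `terms` terms.
    """
    start = max(0, -k)
    return sum(poisson_pmf(j + k, lam1) * poisson_pmf(j, lam2)
               for j in range(start, start + terms))


def zero_inflated_pd_pmf(k, lam1, lam2, pi0):
    """Mixture: extra mass pi0 at zero, Poisson difference otherwise."""
    return (pi0 if k == 0 else 0.0) + (1.0 - pi0) * pd_pmf(k, lam1, lam2)


support = range(-60, 61)
total = sum(pd_pmf(k, 2.0, 3.0) for k in support)
mean = sum(k * pd_pmf(k, 2.0, 3.0) for k in support)
print(total, mean, zero_inflated_pd_pmf(0, 2.0, 3.0, 0.3))
```

The mean of the Poisson difference is lam1 - lam2, so the distribution can take negative values, which is what makes it a candidate for modelling signed index changes.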

Start time 13:10

AN OPTIMAL STOPPING PROBLEM OF FINITE HORIZON WITH REGIME SWITCHING

Chun Wang

School of Mathematical Sciences, The University of Nottingham, UK

Keywords: optimal stopping, regime switching, supermartingale

We study a class of finite-horizon optimal stopping problems under regime switching models by considering a series of optimal stopping problems and its limit. Applications of this problem include the pricing of American put options where the stock price evolves as a regime switching geometric Brownian motion. The construction involved leads naturally to a computational procedure, for which a numerical example is also provided.

12.1.7 Session 2c: Elicitation and Epidemiology

Session Room: MS.05
Chair: Michelle Stanton

Start time 11:30

ON ELICITING EXPERT OPINION IN GENERALIZED LINEAR MODELS

Fadlalla G. Elfadaly and Prof. Paul H. Garthwaite

The Open University, UK

Keywords: Elicitation Methods, Prior Distributions, Generalized Linear Models, Interactive Graphical Software

An important assessment task in Bayesian analysis of generalized linear models (GLMs) is to specify an informative prior distribution for the model parameters. Suitable elicitation methods play a key role in this specification by obtaining and including expert knowledge as a prior distribution.

An elicitation method for quantifying opinion about any GLM was developed in Garthwaite and Al-Awadhi (2006). The relationship between each continuous predictor and the dependent variable (assuming all other variables are held fixed) was modelled as a piecewise-linear relation. The regression coefficients of this relation were assumed to have a multivariate normal distribution. However, a simplifying assumption was made regarding independence between these coefficients, in the sense that regression coefficients were a priori independent if associated with different predictors.

In this current research we relax the independence assumption between coefficients of different variables. This will significantly increase the range of situations where the method is useful, but it means that the variance-covariance matrix of the prior distribution is not necessarily block-diagonal. A method of elicitation for this more complex case is given and it is shown that the resulting variance-covariance matrix is positive-definite.

The method was designed to be used with the aid of interactive graphical software, which is being revised and extended further in this research to handle the case of GLMs with correlated pairs of covariates.

Start time 11:55

DISCORDANCY BETWEEN THE PRIOR AND DATA USING CONJUGATE PRIORS

Mitra Noosha

Queen Mary University of London

In Bayesian Inference the choice of prior is very important to indicate our beliefs and knowledge. However, if these initial beliefs are not well elicited, then the data may not conform to our expectations. The degree of discordancy between the observed data and the proper prior is of interest. Pettit and Young (1996) suggested a Bayes Factor to find the degree of discordancy. I have extended their work to further examples.

I try to find explanations for Bayes Factor behaviour. As an alternative I have looked at a mixture prior consisting of the elicited prior and another with the same mean but a larger variance. The posterior weight on the more diffuse prior can be used as a measure of the prior and data discordancy and also gives an automatic robust prior. I discuss various examples and show this new measure is well correlated with the Bayes factor approach.
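The mixture-prior idea can be sketched for the simplest conjugate case, a normal mean with known sampling variance. This setting, the parameter names and the equal prior weights are assumptions made for illustration and may differ from the examples in the talk. Under each prior component the sample mean has a normal marginal distribution, so the posterior weight on the diffuse component follows from Bayes' theorem.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def diffuse_weight(ybar, n, sigma, m, tau_elicited, tau_diffuse, w=0.5):
    """Posterior weight on the diffuse component of a two-part normal mixture prior.

    Data: sample mean ybar of n observations with known standard deviation sigma.
    Prior: (1 - w) * N(m, tau_elicited^2) + w * N(m, tau_diffuse^2).
    """
    # Marginal density of ybar under each component: N(m, sigma^2 / n + tau^2).
    f_elicited = normal_pdf(ybar, m, sigma ** 2 / n + tau_elicited ** 2)
    f_diffuse = normal_pdf(ybar, m, sigma ** 2 / n + tau_diffuse ** 2)
    return w * f_diffuse / (w * f_diffuse + (1.0 - w) * f_elicited)


# Data in agreement with the elicited prior mean, then in strong conflict with it.
agree = diffuse_weight(ybar=0.0, n=10, sigma=1.0, m=0.0, tau_elicited=1.0, tau_diffuse=3.0)
conflict = diffuse_weight(ybar=5.0, n=10, sigma=1.0, m=0.0, tau_elicited=1.0, tau_diffuse=3.0)
print(agree, conflict)
```

The weight stays below the prior weight when the data sit near the elicited mean and approaches one as the data move away from it, which is the behaviour that makes it usable as a discordancy measure.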

Start time 12:20

INDIAN BUFFET EPIDEMICS: A BAYESIAN APPROACH TO MODELLING HETEROGENEITY

Ashley P. Ford and Gareth O. Roberts

University of Warwick, UK

Keywords: Epidemic, MCMC

The application of mathematical and computer models to the prediction of epidemics in real time often lacks the crucial stage of statistical inference. There is a need for techniques of inference on models which lie between the extremes of oversimplification and being too complex for inference.

The Indian Buffet Epidemic model has been developed to address the need for a model which is more suitable than assuming homogeneous mixing or an incorrect network model. The aim is to have a process which fits the heterogeneity and two or three parameters that measure the departure from homogeneity.

The Indian Buffet Epidemic combines a bipartite network model with the Indian Buffet process to provide a realistic model which is simple to define and simulate from. The model assumes that there are a large number of potential classes and that individuals belong to a subset of these classes. The classes might be households, schools, clubs, etcetera. An important feature of this new class of models is that the classes do not need to be specified. Within each class infection occurs homogeneously and recovery is as in the basic SIS or SIR model.

The model is described along with an MCMC algorithm for deriving parameter estimates. An important aspect is the development of a new proposal distribution for large binary matrices. The algorithm is demonstrated on a range of simulated data from both the true model and other epidemic models, and comparisons are made between centered and non-centered representations for the augmented data.

Start time 12:45

A HIDDEN MARKOV MODEL TO ANALYSE MRSA TRANSMISSION IN HOSPITAL WARDS

Colin Worby

University of Nottingham, UK

Keywords: hidden Markov model, epidemiological model, MRSA

Methicillin-resistant Staphylococcus aureus (MRSA) remains a problem in healthcare institutions in the UK and worldwide, causing serious, sometimes life-threatening, infections with limited treatment options. For this reason there is much emphasis on the prevention of transmission, for example through the isolation of known cases in side rooms or cohorts, and the use of contact precautions such as disposable gowns and gloves. However, there is still much debate over the efficacy of individual control measures. We use hospital data collected from a selection of general medical wards, and create a model describing MRSA transmission dynamics amongst patients, with the aim of estimating the effectiveness of hospital infection control strategies. A hidden Markov model is used to describe the indirectly observed MRSA transmission process, accounting for the fact that screening to detect MRSA presence is not 100% accurate. This framework allows us to analyse how the probability of a patient acquiring MRSA is related to ward prevalence, and how effective isolation and decolonisation measures are in reducing transmission. The study confirms a reduction in transmission due to the combined effect of isolation and decolonisation treatment. While side room isolation is widely used in controlling the spread of nosocomial pathogens, we found no evidence to suggest physical isolation, through moving patients to a single room, significantly reduces transmission potential in comparison to isolation methods on the open ward.

Start time 13:10

ESTIMATING THE SIZE OF A BADGER POPULATION USING LIVE CAPTURE AND POST-MORTEM DATA

Neil Walker1, Dez Delahay1 and Prof Peter Green2

1 Fera, Woodchester Park, Stonehouse, Glos.
2 Maths Dept, University of Bristol

Keywords: Bayesian, mark-recapture, autocorrelation, population size

Woodchester Park in Gloucestershire has been the site of an intensive mark-recapture study of a local badger population since 1975. We consider methods of population size estimation using these data supplemented by information on post-mortem recoveries. Of particular relevance is the integrated approach advocated by Catchpole et al. (1998), which is applied in a Bayesian context. In addition, we look at idiosyncrasies in the data and possible extensions, for example temporal autocorrelation in the capture, survival and recovery parameters. Finally, the performance of different models is considered and we discuss possible reasons for these differences.

12.1.8 Session 2d: Multivariate Statistics

Session Room: A1.01
Chair: Nathan Huntley

Start time 11:30

CAUCHY PRINCIPAL COMPONENTS ANALYSIS

Aisha Fayomi and Prof. Andy Wood

University of Nottingham, UK

Keywords: Principal Components Analysis, robust statistical techniques, Cauchy likelihood

Robust methods are highly relevant in multivariate statistical analysis. Many different robust methods have been developed to cover the needs of numerous other fields. Principal components analysis (PCA) is considered one of the most important techniques in statistics. However, it depends on either a covariance or a correlation matrix, both of which are very sensitive to outliers. We therefore set out to develop a more robust alternative to classical PCA, using the Cauchy likelihood function to construct a robust principal components procedure.

Start time 11:55

APPROXIMATE JOINT STATISTICAL INFERENCE FOR LARGE SPATIAL DATASETS

James Sweeney and John Haslett

Trinity College Dublin, Ireland

Keywords: Multivariate nonparametric regression, Palaeoclimate reconstruction, Inverse problems

We propose an approximate sequential approach for inferring the correlation matrix in large multivariate spatial regression problems. This enables the decomposition of the computationally intensive, multivariate, "joint" problem into a set of independent univariate problems, with possible correlation structure inferred sequentially. Omission of correlation structure (where inappropriate) in potential models will lead to increased uncertainty in the degree of confidence at the reconstruction stage of an associated inverse problem.

The results from the proposed sequential approach are compared to those obtained using the (correct) full joint approach through the comparison of bias and predictive properties for simulated and palaeoclimate data. Inference procedures used are Empirical Bayes (EB) based, where the hyperparameters governing a given model are considered as unknown fixed constants.

Start time 12:20

MULTIVARIATE OUTLIERS, THE FORWARD SEARCH AND THE CRONBACH’S RELIABILITY COEFFICIENT

Michael Tsagris

University of Nottingham, UK

Keywords: multivariate outliers, Forward search, Cronbach’s alpha

Multivariate outliers are of great interest due to the nature of the data. While in the univariate case things are straightforward, when moving to more than one variable things can be very difficult. In this work, multivariate outlier detection methods are discussed and the Forward search is also implemented. Robust estimates of scatter and location are the key feature for the detection of outliers. Finally, Cronbach's reliability coefficient is discussed and applied to the Forward search as a monitoring statistic.
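For reference, Cronbach's reliability coefficient for k items is k/(k-1) times one minus the ratio of the summed item variances to the variance of the total score. The sketch below computes it directly; the function name and the data layout (one list of scores per item) are assumptions made for illustration, and the Forward search itself is not implemented here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: list of k lists, one per item, each holding the n subjects' scores.
    """
    k = len(items)
    # Total score of each subject across all items.
    totals = [sum(scores) for scores in zip(*items)]
    summed_item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - summed_item_variance / variance(totals))
```

Recomputing this statistic as observations enter a subset one at a time is one way it could serve as a monitoring statistic, although how the abstract's method does this is not specified.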

Start time 12:45

BAYESIAN ANALYSIS IN MULTIVARIATE DATA

Rofizah Mohammad and Dr. Karen Young

University of Surrey, UK

Keywords: Model choice, Bayes factors, Classification, Discriminant analysis, Influential observations

In this presentation we will be considering a Bayesian approach to model selection in multivariate normal data using the Bayes factor, similar to that used by Spiegelhalter and Smith (1982). We are particularly interested in classifying observations when we know that they come from different populations. We shall compare classical techniques of linear and quadratic discriminant functions with a new Bayesian approach. We are interested in looking at the effect of observations on this classification. One diagnostic to determine the effect of observations on a Bayes factor is k_d, which is used to assess the effect of individual observations on model choice (Pettit and Young, 1990).

Start time 13:10

SOME ASPECTS OF COMPOSITIONAL DATA

Fiona Sammut

University of Warwick, UK

Keywords: Multivariate Constrained Data

A composition X is a D-vector whose components X_1, ..., X_D satisfy a sum constraint, that is, X_1 + ... + X_D = c, where c may be equal to 1, 100, 10^6 or any other constant, depending on the unit of measurement. Due to its nature, compositional data conveys only relative information, the elements are always zero or positive, and one part of the composition may always be written in terms of the remaining parts. The data are thus not free to range as the unconstrained variables encountered in traditional multivariate analyses do. This fact conditions the variance-covariance structure in that at least one covariance is forced to be negative. In general, analyzing compositional data with methods which are based on the variance-covariance or correlation structure, like factor analysis, discriminant analysis and principal component analysis, would lead to incorrect results. It was thus necessary to find some parametric class of distributions which could cater for the dependence structure between the parts of the compositions but which could also make the transition from the simplex (the space of compositional data) to the whole real line possible. A possible approach to such a situation is based on logratio transformations, which provide a one-to-one mapping from the simplex to the real space, removing the problem of having to work within a constrained sample space. Such a transformation then makes it possible to apply the standard multivariate techniques to the transformed compositional data. A major shortcoming which is common to all logratio transformations, however, is that if some parts of a composition are zero, the corresponding logratios may not be computed. Different strategies had to be developed in an attempt to deal with this problem.
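The logratio idea can be made concrete with two commonly used transformations, the additive logratio (relative to the last part) and the centred logratio (relative to the geometric mean). The sketch below is illustrative only; the function names are assumptions and the abstract does not say which transformations or zero-handling strategies the talk covers.

```python
import math


def closure(x, c=1.0):
    """Rescale positive parts so that they sum to the constant c."""
    total = sum(x)
    return [c * v / total for v in x]


def alr(x):
    """Additive logratio: log(x_i / x_D) for i = 1, ..., D-1."""
    return [math.log(v / x[-1]) for v in x[:-1]]


def alr_inverse(y, c=1.0):
    """Map a point of real (D-1)-space back to the simplex."""
    return closure([math.exp(v) for v in y] + [1.0], c)


def clr(x):
    """Centred logratio: log of each part minus the mean log part."""
    mean_log = sum(math.log(v) for v in x) / len(x)
    return [math.log(v) - mean_log for v in x]
```

Applying `alr_inverse(alr(x))` to a closed composition returns the composition itself, which shows the one-to-one mapping. A composition with a zero part, such as `[0.0, 0.5, 0.5]`, makes `math.log` raise a `ValueError`, which is the shortcoming noted above.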

12.1.9 Session 3a: Genetics

Session Room: MS.01
Chair: Dennis Prangle

Start time 14:30

INCORPORATING AVAILABLE BIOLOGICAL KNOWLEDGE TO EXPLORE GENOME-WIDE ASSOCIATION DATA

Marina Evangelou

MRC-Biostatistics Unit, University of Cambridge, UK

Keywords: Genome-wide association studies, Pathway-based analysis

The evolution of the science of genetics and the development of genotyping technologies have made genome-wide association studies (GWAS) feasible. GWAS have been successful in identifying SNPs that are significantly associated with various complex diseases, but they do not have the required power to detect small effects of SNPs that are known to be biologically associated with the disease. Our research focuses on the exploration of genome-wide association data using pathway-based analysis. Pathway-based analysis is a joint test of association between a group of SNPs/genes within a known biological pathway and the outcome (which can be either a binary response variable or a continuous one). Pathway-based approaches have the advantage of incorporating the available biological knowledge of SNPs and genes and therefore have a better chance of identifying the true model of association.

Our genome-wide association study aims to identify the relationship between genetic loci and platelet function. Platelets, which play an important role in thrombus formation, are rapidly activated by a range of agonists like collagen and ADP. This study involves a cohort of 500 healthy individuals, for each of whom four endpoints were measured in order to determine platelet function: fibrinogen and p-selectin responses to ADP and collagen agonists. It is believed that a large number of genes with small effects are associated with platelet function, and we are aiming to find these by implementing approaches to pathway analysis.

Start time 14:55

INFORMED BAYESIAN CLUSTERING OF GENE EXPRESSION LEVELS

Anna Fowler

Imperial College London, UK

Keywords: Bayesian Hierarchical Clustering, Variable Selection, Gene Expression Levels

Single Nucleotide Polymorphisms (SNPs) occur when there is a variation in the DNA sequence at one of the nucleotide bases. This can cause differences in the proteins produced and therefore alter the actions of the cell. HLA-DQA proteins play an essential role in the immune system by presenting antigens to a specific group of white blood cells (T cells) to enable them to produce the antibodies needed. The data we are analysing are part of the HapMap project and consist of genotype labels for three SNPs which cause the 116 subjects to produce different amounts of the HLA-DQA protein. There are also gene expression levels for each subject, which indicate the level of production for the proteins associated with each gene. It is the immune system which is primarily of interest here, and of the 3538 measured genes very few produce proteins which are related to immunity. Identifying the significant genes is complicated by the dimensionality of the data and has been approached in many ways recently.

Two-way Bayesian hierarchical clustering allows clusters to form over both genes and subjects, revealing the underlying block-like structure of the data. Genes which are related to the immune system are more likely to be co-regulated with the SNP genotypes than those which are not. Therefore, the clustering of the subjects and their genotypes will influence the clustering of the genes which are related to the immune system significantly more than the clustering of those which are not. Hence, by applying a novel method of two-way clustering only over the genes which benefit significantly from this additional information, we seek to determine which gene clusters are co-regulated with the production of the HLA-DQA proteins and identify these genes as the variables associated with the immune system.

Start time 15:20

AN APPLICATION OF BAYESIAN TECHNIQUES FOR MENDELIAN RANDOMIZATION TO ASSESS CAUSALITY IN A LARGE META-ANALYSIS

Stephen Burgess and Simon G. Thompson

MRC Biostatistics Unit, University of Cambridge

Keywords: Genetic epidemiology, Mendelian randomization, Causality, Meta-analysis,Bayesian methods

The determination of causality from observational data is historically a controversial question. Observational relationships between a risk factor and an outcome are affected by confounding and reverse causation. Mendelian randomization is a technique whereby genetic information is used analogously to randomization in a randomized controlled trial. Under certain assumptions, genetic information can give insight into the nature and direction of a causal association. Genetic variation in a risk factor is determined at birth, so is causally prior to any event, and is allocated randomly in population groups, meaning that subgroups differing in genetic variants with a specific effect on the risk factor of interest will not systematically differ in other factors. We show how novel Bayesian techniques can be applied to a large dataset, comprising over 100 000 participants in over 30 different studies measuring over 20 different genetic variants, to assess the causal association of C-reactive protein on coronary heart disease.

Start time 15:45

BAYESPEAK: A HIDDEN MARKOV MODEL FOR ANALYSING CHIP-SEQ EXPERIMENTS

Jonathan Cairns1, Christiana Spyrou4, Andy Lynch1, Rory Stark3 and Simon Tavare1

1 Department of Oncology, University of Cambridge, Li Ka Shing Centre, Cambridge, UK
2 DAMTP, Centre for Mathematical Sciences, Wilberforce Road, Cambridge, UK
3 Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Cambridge, UK
4 MRC Clinical Sciences Centre, Faculty of Medicine, Imperial College London, UK

Keywords: Bayesian Inference, Hidden Markov Model, Gibbs Sampling, Metropolis-Hastings, Negative Binomial, Oncology, ChIP-seq

Accurate identification of interactions between proteins and DNA is a key element in understanding the mechanisms that lead to cancer. The biological experiment "ChIP-seq" is used to investigate sites on the chromosome where proteins bind, often activating or silencing a particular gene.

The data presents itself as "peaks" across the chromosome. However, various technical or biological effects can lead to noise, disguising true peaks and even generating false peaks.

Hidden Markov Models (HMMs) have applications in this biological setting. We can use the hidden state to indicate a binding site, and choose a model that reflects the expected biological features of the signal.

"BayesPeak" is an MCMC algorithm we have developed to solve this problem, using a Bayesian approach and based on negative binomial emissions. I will be discussing the statistical issues we face when fitting our theoretical model to large data sets.

12.1.10 Session 3b: Medical Statistics II

Session Room: MS.04
Chair: Helen Thornewell

Start time 14:30

DESIGNING A SERIES OF PHASE II TRIALS

Siew Wan Hee

Warwick Medical School, University of Warwick, UK

In some diseases with very small patient populations, the number of patients eligible for clinical trials is limited. When the development of new therapies proceeds faster than the recruitment of patients, there is a need to identify a promising treatment as quickly as possible. A design that requires fewer patients will require less time to identify a treatment for further testing in a phase III trial. Some authors (Whitehead, 1985 and Yao et al., 1996) have proposed considering a series of clinical trials where each trial tests a treatment that is different from the others. There is a trade-off between large trials, which require many patients, and small trials, which may yield little information, particularly if there is a high start-up cost. We propose a design that is a hybrid of classical frequentist and Bayesian approaches, where the traditional analysis at the end of the trial is based on conventional frequentist hypothesis testing and the Bayesian method is used to maximize the power of the series of trials. Designs are obtained that optimise the number of patients and power for each trial in a series. The total number of patients eligible for trial and the type I error (declaring the treatment effective when it is not) are fixed, and a start-up cost is included.

Start time 14:55

RESPONSE-ADAPTIVE BLOCK RANDOMIZATION IN BINARY ENDPOINT CLINICAL TRIALS

Dominic Magirr

Lancaster University, UK

Keywords: Clinical trials, Adaptive design

The results of a clinical trial will typically accumulate steadily throughout its duration. Response-adaptive randomization (RAR) uses the accumulating data in order to skew the randomization of remaining patients to treatment groups in favour of the current better performing treatment. The aim is to reduce the number of patients receiving inferior treatment. RAR has rarely been used in practice. One example is a trial of extra-corporeal membrane oxygenation (ECMO) to treat newborn infants with respiratory failure. The results of the trial were controversial in large part because only one patient received control therapy. In this talk the ECMO trial is described. Alternative RAR designs are proposed that incorporate random permuted blocks in order to eliminate the possibility of such an extremely unequal allocation ratio.

Start time 15:20

BAYESIAN CLINICAL TRIAL DESIGNS FOR SURVIVAL OUTCOMES

Shijie Ren

University of Sheffield, UK

Keywords: Assurance, Survival outcome

When designing a clinical trial, sponsors or decision-makers may only consider the power of the trial, i.e. the conditional probability of a successful trial assuming a specified treatment effect. Since the treatment effect is uncertain, this will not provide a reliable assessment of the probability of a successful outcome and can often give a misleading impression of the likely outcome of the trial. As an alternative to using power, one can consider the unconditional probability of a successful trial outcome, known as assurance. This involves utilizing prior information about treatment effects in the design of the trial. We consider how to derive assurance when a trial's outcome measure is survival time. We allow for uncertainty in both the treatment effect and the control group survival function.
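The distinction between power and assurance can be sketched numerically by averaging the power function over a prior for the treatment effect. The example below is a simplified illustration under loud assumptions: it uses a normally distributed outcome with known standard deviation and a one-sided z-test, not a survival outcome as in the abstract, and the normal prior and all names are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power(delta, n_per_arm, sigma, alpha=0.025):
    """Power of a one-sided two-sample z-test for a fixed true effect delta."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(delta * (n_per_arm / 2) ** 0.5 / sigma - z_crit)


def assurance(prior_mean, prior_sd, n_per_arm, sigma, draws=20000, seed=1):
    """Monte Carlo assurance: power averaged over a normal prior on delta."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        delta = rng.gauss(prior_mean, prior_sd)  # one plausible true effect
        total += power(delta, n_per_arm, sigma)
    return total / draws
```

Comparing `power(0.5, 50, 1.0)` with `assurance(0.5, 0.3, 50, 1.0)` shows how much the probability of success changes once uncertainty about the effect is acknowledged rather than fixing the effect at a single value.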

Start time 15:45

THE POWER OF THE BIASED COIN DESIGN FOR CLINICAL TRIALS

Wai Yin Yeung

Queen Mary, University of London, UK

Keywords: biased coin design, clinical trials, sequential patient allocation

The biased coin design introduced by Efron (1971, Biometrika) is a design for allocating patients in clinical trials which helps to maintain the balance and randomness of the experiment. Chen (2006, Journal of Statistical Planning and Inference) studied the power of repeated simple random sampling and the biased coin design, in which the power is treated as the conditional probability of correctly detecting a treatment effect given the current numbers of patients on the two treatments, the control group and the treatment group. The variances of the responses for the two groups are assumed to be equal. The z test and the t test for a treatment effect are used to demonstrate and analyse the power function when the variances of the treatment responses are known and unknown, respectively. Numerical results given in his paper showed that the biased coin design is uniformly more powerful than repeated simple random sampling.

In this talk, I shall report on my current work, which extends Chen's work on power to the case where the variances of the responses for the two treatments are assumed to be different. I will give numerical results for the powers of repeated simple random sampling and the biased coin design when the variances are known and different, and also when they are unknown and different.
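The allocation rule of the biased coin design is short enough to sketch directly: when the arms are balanced a fair coin is tossed, and otherwise the under-represented arm is chosen with probability p. The sketch below uses p = 2/3, the value Efron suggested; the function name and the arm labels are illustrative assumptions, and the power calculations discussed in the talk are not reproduced.

```python
import random


def biased_coin_allocation(n_patients, p=2 / 3, seed=0):
    """Allocate patients sequentially to treatment ("T") or control ("C")."""
    rng = random.Random(seed)
    n_treat = n_ctrl = 0
    arms = []
    for _ in range(n_patients):
        if n_treat == n_ctrl:
            prob_treat = 0.5        # balanced: toss a fair coin
        elif n_treat < n_ctrl:
            prob_treat = p          # treatment arm is behind: favour it
        else:
            prob_treat = 1 - p      # treatment arm is ahead: disfavour it
        if rng.random() < prob_treat:
            arms.append("T")
            n_treat += 1
        else:
            arms.append("C")
            n_ctrl += 1
    return arms
```

Setting p = 1/2 recovers simple random allocation, while p = 1 forces the arms back into balance after every second patient at the cost of predictability.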

12.1.11 Session 3c: Dimension Reduction

Session Room: MS.05
Chair: James Sweeney

Start time 14:30

ORACLE PROPERTIES OF LASSO-TYPE METHODS IN REGRESSION PROBLEMS

Sohail Chand

School of Mathematical Sciences, University of Nottingham, UK

Keywords: Variable Selection, Lasso, LARS, Oracle properties.

In model building, we often have a large set of predictors. As all the variables are not equally important for the model, we seek a parsimonious model. Parsimonious models are very important for prediction purposes as overfitted models have higher prediction variance. In practice, it is often quite difficult to find a model which is a good fit as well as easy to interpret. As discussed by Fan and Li (2001, JASA 96(456):1348-1360), a good estimation procedure should have the oracle properties, namely variable selection consistency and the optimal estimation rate. Lasso-type methods in the regression context are popular for their simultaneous estimation and variable selection. Our numerical results show in some scenarios how normalisation of the predictors can nullify the advantage of using the adaptive weights and may lead to failure of the necessary and sufficient condition for correct subset selection. The choice of the regularisation parameter is critical for the oracle performance of these methods. We have compared the performance of cross validation with the Wang and Leng (2009, J Roy Stat Soc B Met; 71(3):671-683) BIC approach in choosing the appropriate value of the regularisation parameter. Our results show that the cross validation choice of regularisation parameter may lead to inconsistent variable selection.

Start time 14:55

PENALIZED WEIGHTED LEAST SQUARES VARIABLE SELECTION METHOD FOR AFT MODELS WITH HIGH DIMENSIONAL COVARIATES

Md. Hasinur Rahaman Khan and J. Ewart H. Shaw

University of Warwick, UK

Keywords: AFT model, Penalized Regression, Variable Selection, Weighted Least Squares

Although penalized regression methods have received a great deal of attention in recent years for simultaneous variable selection and coefficient estimation, particularly in the analysis of high-dimensional datasets, only a small number of methods based on penalized approaches have been suggested for survival datasets. Here we look at a new penalized approach, based on weighted least squares, for model estimation and variable selection in parametric accelerated failure time (AFT) models. We applied this approach to the log-normal AFT model with both low-dimensional and high-dimensional datasets. This approach improves predictive accuracy, which is an important inferential goal in survival analysis when dealing with variable selection techniques. The performance of this approach is demonstrated with simulated examples and real datasets where time to survival, in the presence of right censoring, is of interest.

Start time 15:20

LATENT VARIABLE MODELS FOR PROCESS MONITORING

Javier Serradilla and Dr. Jian Q. Shi

Newcastle University, UK

Keywords: Multivariate Statistical Process Control, Latent Variable Models, Probabilistic PCA

Fault detection and diagnosis in manufacturing processes are a key aspect of current good engineering practice. Statistical approaches to fault detection based on historical operating data have been found to be advantageous with processes having a large number of measured variables. These models, however, tend to underperform in the area of fault diagnosis, where the variable(s) responsible for the plant's abnormal behaviour must be identified.

In this presentation we intend to review how latent variable models can be used both to reduce the data dimensionality and to form subgroups of variables. These new variables are then used for process monitoring. The added advantage of the approach is that each latent variable will be selectively looking at a specific and well defined subset of the original variables. Likewise, fault detection is quicker as the confounding effect of redundant variables is eliminated.

Start time 15:45

A STUDY OF ITEM SELECTION USING PRINCIPAL COMPONENT ANALYSIS AND CORRESPONDENCE ANALYSIS

Nur Fatihah Mat Yusoff

National University of Ireland, Galway

Keywords: item selection, principal component analysis, correspondence analysis

This study investigates dimension-reduction techniques in psychometric testing using Principal Component Analysis (PCA) and Correspondence Analysis (CA). Psychometric research is one of the fields of social science that is interested in the theory and techniques of educational and psychological measurement. Researchers in this area are frequently concerned with the construction and validation of measurement instruments. Theoretically, PCA is a mathematical algorithm that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables by performing a covariance analysis between variables. The PCA concept is closely related to Factor Analysis (FA), which aims to detect structure in the relationships between variables. It is a common technique that has been used by social science researchers in conducting validity and reliability analysis of their studies. CA can be considered as a factor method for categorical variables and is often linked with producing a low-dimensional graphical display of variables and units. Simple CA is a technique designed to analyse a two-way table, while Multiple Correspondence Analysis (MCA) is an extension of simple CA in that it is applicable to a large set of variables. The result provides information which is similar in nature to that produced by principal component analysis, and allows us to explore the structure of the categorical variables included in the table.

This study is concerned with reducing the dimension, or number of variables, in an instrument by using the data from a pilot study on personality traits. The original instrument was developed by Oliver P. John and Sanjay Srivastava from the University of California, Berkeley in 1999. The pilot survey was conducted at the University Malaysia Sarawak, Malaysia, where 80 students from second year and above were randomly selected as respondents. In the original instrument, there are 44 items to assess five personality traits, or the big five dimensions. We believe that some of the items, or even dimensions, are not relevant in the Malaysian context. At the end of this study, our aim is to produce the best instrument that can represent all of the variables that we are interested in for subsequent use in structural equation modelling of student achievement.

12.1.12 Session 3d: Environmental

Session Room: A1.01
Chair: Andrew Smith

Start time 14:30

USING A BAYESIAN HIERARCHICAL MODEL FOR TREE-RING DATING

Emma M. Jones1, Caitlin E. Buck1, Clifford D. Litton2, Cathy Tyers1 and Alex Bayliss3

1 University of Sheffield, UK
2 University of Nottingham, UK
3 English Heritage, UK

Keywords: Dendrochronology, Bayesian hierarchical modelling

Dendrochronology, or tree-ring dating, uses the annual growth of tree-rings to date timber samples. Variation in ring width is determined by variation in the climate. Trees within the same geographical region are exposed to the same climatic signal in each year, but the signal differs from year to year.

Dendrochronologists measure sequences of tree-ring widths with a view to dating samples by matching undated sequences to dated sequences known as ‘master’ chronologies. The tree-ring widths from undated timbers are measured and the data are processed to remove growth trend. The processed data are sequentially matched against one another, each match position being known as an offset; initially matching timbers from the same site or woodland and then matching average sequences from each site or woodland, known as ‘site’ chronologies, to master chronologies.

The hierarchical nature of the data leads to modelling the data using a Bayesian hierarchical model. The ring-width for tree j in year i is modelled as the sum of the climatic signal in year i and a random noise which is particular to tree j in year i. This model can be extended to include climatic signals at varying geographic scales. A Gibbs sampler is used to produce posterior probabilities for a match at each offset. This methodology relies on careful prior specification of parameters at each level of the hierarchy. Data are currently being collated from trees of known age from several woods in the UK that will be used to provide informative prior knowledge.
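The additive model (ring width = climatic signal in year i + tree-specific noise) and the idea of a posterior probability for each offset can be sketched in a heavily simplified setting. The example below assumes, purely for illustration, that the master chronology is a known climatic signal, that the noise is normal with known standard deviation, and that all offsets are equally likely a priori; it does not use a Gibbs sampler or a hierarchy, and the function names are hypothetical.

```python
import math
import random


def grow_tree(climate, noise_sd, rng):
    """Ring width in year i = climatic signal in year i + tree-specific noise."""
    return [c + rng.gauss(0.0, noise_sd) for c in climate]


def offset_posterior(master, sample, noise_sd):
    """Posterior probability of each offset under a uniform prior on offsets."""
    n_offsets = len(master) - len(sample) + 1
    log_lik = []
    for off in range(n_offsets):
        # Sum of squared differences between sample and master at this offset.
        sse = sum((y - master[off + t]) ** 2 for t, y in enumerate(sample))
        log_lik.append(-sse / (2 * noise_sd ** 2))
    top = max(log_lik)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_lik]
    total = sum(weights)
    return [w / total for w in weights]
```

A sample cut from a simulated master sequence with `grow_tree` and passed to `offset_posterior` yields a probability for every candidate match position, which is the kind of output the full hierarchical model produces with its uncertainty properly propagated.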

Start time 14:55

NOT ANOTHER SPECIES RICHNESS ESTIMATOR?!

Beth Norris

University of Kent, UK

Keywords: Statistical ecology, Species richness estimation

One of the oldest and most intuitive measures of biodiversity is species richness, which is simply the number of species present in an area of study. Sampling from populations will rarely give a complete inventory of species and therefore several methods have been developed in order to estimate the true species richness of a population from sample data. There are over 20 different techniques already described that will produce an estimate of total species richness, so why do we need another?

Species richness estimators often perform badly for benthic data sets. Some studies have suggested that species richness estimation is dependent on spatial patterns, and that the clustered spatial distribution of benthic assemblages hampers incidence-based estimators such as Chao2 and ICE. None of the commonly used species richness estimators considered take into account spatial heterogeneity, and non-parametric estimators often underestimate the total species richness for such data sets.

Therefore, an alternative approach has been proposed which relies on modelling the underlying spatial pattern of individual species. The modelling framework considered is based on the method of maximum likelihood, and fits a parametric model to observed species abundances. As species heterogeneity factors will be taken into account alongside species abundances, this method should perform well in estimating the true species richness of an area. The method will be assessed by simulation, and will be applied to benthic data sets supplied by Cefas. The estimates will be compared to the results from some established estimators.
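As a point of comparison for the established estimators mentioned above, the incidence-based Chao2 estimator adds to the observed richness a correction built from the number of species found in exactly one sample and in exactly two samples. The sketch below uses the bias-corrected form as it is commonly stated; the function name and the input layout (one set of species per sample) are assumptions, and the proposed spatial model is not shown.

```python
from collections import Counter


def chao2(samples):
    """Bias-corrected Chao2 estimate from a list of per-sample species sets."""
    m = len(samples)
    # Number of samples in which each species was recorded.
    incidence = Counter(species for sample in samples for species in set(sample))
    s_obs = len(incidence)
    q1 = sum(1 for n in incidence.values() if n == 1)  # uniques
    q2 = sum(1 for n in incidence.values() if n == 2)  # duplicates
    return s_obs + (m - 1) / m * q1 * (q1 - 1) / (2 * (q2 + 1))
```

Because the correction depends only on how many samples each species appears in, the estimator takes no account of where the samples lie, which is the limitation for spatially clustered assemblages described above.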

Start time 15:20

UNCERTAINTY ANALYSIS FOR MULTIPLE ECOSYSTEM MODELS USING BAYESIAN EMULATORS

Rachel Oxlade, Prof. Michael Goldstein and Dr. Peter Craig

University of Durham, UK

Keywords: Bayesian, Bayes Linear, simulator, ecosystem, model, emulation

Bayesian emulation provides a tool for analysing complex simulators. When there are many parameters over a large input space, and model runs are costly, emulation enables us to approximate the simulator across the space, and gives a measure of our uncertainty at each point.

This talk introduces emulation and then investigates how it can be applied to HadOCC, the Hadley Centre Ocean Carbon Cycle model. The goal of the project is to be able to jointly emulate two simulators of the same system, and this idea will be introduced in the talk.

Start time 15:45

ESTIMATING BIOLOGICALLY PLAUSIBLE RELATIONSHIPS BETWEEN AIR POLLUTION AND HEALTH

Helen Powell, Duncan Lee and Adrian Bowman

Department of Statistics, University of Glasgow, UK

Keywords: Air pollution, Monotonic dose-response relationship, Respiratory health

The effects of air pollution on human health can be estimated using ecological time-series studies, which comprise daily data for the population living within an urban area. The responses are daily counts of mortality or morbidity outcomes, which are related to air pollution concentrations and other covariates. The majority of studies estimate a linear relationship between pollution (x_t) and health, although a number have estimated non-linear dose-response curves g(x_t). However, these curves are typically unconstrained and estimated using smoothing or penalised splines, meaning that biologically implausible results can occur. For example, for some levels of pollution the estimated health effects may decrease for increasing concentrations. Therefore, we propose a method for estimating biologically plausible dose-response curves, which must satisfy the following properties: (i) increasing monotonicity; (ii) smoothness; and (iii) g(0) = 0, which together force the dose-response curve to be non-negative.

We applied this approach to data from Glasgow, using counts of respiratory-related hospital admissions and ozone concentrations. We compared our model with one that incorporates an unconstrained curve, and found that the latter produced unrealistic results, as the relative risk fell below one and there was a decreasing risk of hospital admissions at high concentrations of ozone. In contrast, the constrained curve does not give a relative risk below one for any concentration of ozone, and therefore does not imply that ozone could be beneficial to health. This curve was also biologically plausible, because increasing ozone concentrations result in increasing health risks.

12.2 Wednesday 14th April

12.2.1 Session 4a: Medical Statistics III

Session Room: MS.01
Chair: Fiona McElduff

Start time 09:10

AN APPLICATION OF SURVIVAL TREES TO THE STUDY OF CARDIOVASCULAR DISEASE

Alberto Alvarez Iglesias1, John Newell2 and Liam Glynn3

1 School of Mathematics, Statistics and Applied Mathematics, NUI, Galway, Ireland.

2 Clinical Research Facility, NUI, Galway, Ireland.

3 Department of General Practice, NUI, Galway, Ireland.

Keywords: Recursive partitioning, Survival Trees, Random Survival Forest

Recursive partitioning methods are a popular non-parametric alternative to the classical parametric and non-parametric models in regression, classification and survival problems. They have been recognised as a useful modelling tool as they produce a model that is very easy to interpret. The beauty of these methods lies in their simplicity and the relative ease with which the results of the analysis can be explained to a person with a non-statistical background. Single trees are an excellent way to describe the structure of the learning data but their predictive power can be disappointing. In the last decade, many efforts have been made to overcome this problem. The resulting methods are generally known as “ensemble methods” and they use a set of trees, created by bootstrapping the original data, in order to improve predictive performance. The price to be paid, however, is the absence of a single tree. In this work, a data set of 1586 patients with cardiovascular disease will be analyzed. The primary endpoint was a cardiovascular composite endpoint, which included death from a cardiovascular cause or any of the cardiovascular events of myocardial infarction (MI), heart failure, peripheral vascular disease and stroke. Seventeen factors/covariates will be considered for development of a prognostic model and the results of different methods for growing survival trees will be compared.

Start time 09:35

ANALYSIS OF AN OBSERVATIONAL STUDY TO IN COLORECTAL CANCER PATIENTS

Cara Dooley1, John Hinde1 and John Newell2

1 National University of Ireland, Galway

2 Clinical Research Facility, National University of Ireland, Galway

The aim of the study was to compare survival of colorectal cancer patients in the whole population against the survival of patients in a sub-population who also had inflammatory bowel disease (IBD). All individuals who suffered from colorectal cancer were drawn from the entire Irish population using data from January 1994 to December 2005 provided by the National Cancer Registry of Ireland (NCRI). The control group contained many more observations (n > 20000) than the IBD group (n = 170). Given the number of control patients, there was large diversity in this group. In a conventionally designed experiment or trial, patients entering the trial would be chosen to be as similar as possible, usually in age, health, etc. As this was an observational study, there was no design prior to collecting the data. To compensate for this lack of design, each IBD patient is matched to the “closest” control patient. For each pair of IBD and control patients a distance is calculated, and the two patients with the smallest distance between them (and so the most similar) are matched. The distance used in this case is a Mahalanobis distance based on ranks. The matching is carried out using the optmatch package in R.
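
The matching idea can be sketched in a few lines of Python. Everything here is illustrative: the two covariates (age and year of diagnosis) and all records are hypothetical, the covariance of the ranks is used without any adjustment for ties, and a simple greedy nearest-neighbour rule stands in for the optimal matching performed by optmatch.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def inverse_covariance(u, v):
    """Inverse of the 2x2 sample covariance matrix of two rank vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suu = sum((a - mu) ** 2 for a in u) / (n - 1)
    svv = sum((b - mv) ** 2 for b in v) / (n - 1)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)
    det = suu * svv - suv ** 2
    return [[svv / det, -suv / det], [-suv / det, suu / det]]


def distance(p, q, inv):
    """Mahalanobis distance between two points of ranks."""
    d0, d1 = p[0] - q[0], p[1] - q[1]
    quad = (d0 * (inv[0][0] * d0 + inv[0][1] * d1)
            + d1 * (inv[1][0] * d0 + inv[1][1] * d1))
    return quad ** 0.5


# Hypothetical (age, year of diagnosis) records; not data from the study.
cases = [(62, 1998), (45, 2003)]
controls = [(30, 1995), (61, 1998), (70, 2001), (46, 2004), (55, 1996)]

# Rank each covariate over the pooled sample, then measure distances on ranks.
pooled = cases + controls
age_rank = average_ranks([p[0] for p in pooled])
year_rank = average_ranks([p[1] for p in pooled])
inv = inverse_covariance(age_rank, year_rank)
points = list(zip(age_rank, year_rank))

# Greedy matching without replacement (an assumption; optmatch is optimal).
matches = {}
unused = set(range(len(controls)))
for i in range(len(cases)):
    best = min(unused,
               key=lambda j: distance(points[i], points[len(cases) + j], inv))
    matches[i] = best
    unused.remove(best)
```

With these toy records, each case is paired with the control whose age and diagnosis year are nearest in rank terms. Greedy matching can be sub-optimal when cases compete for the same control, which is one reason to prefer an optimal matching routine in practice.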

Start time 10:00

CAUSAL INFERENCE IN LONGITUDINAL DATA ANALYSIS: A CASE STUDY IN THE EPIDEMIOLOGY OF PSORIATIC ARTHRITIS

Aidan O’Keeffe

University of Cambridge, UK

Keywords: Causality, Multi-state model, Local Dependence and Independence, Psoriatic Arthritis

In any setting where there exists a causal link between two processes or events, the cause must precede its effect. Hence, it seems plausible that a model which aims to uncover a causal relationship should account for the passage of time between cause and effect. Longitudinal data are characterised by repeated measurements being taken over time on units/subjects, and in this longitudinal setting it appears natural to consider causality. Multi-state models offer a way of describing changes in longitudinal data over continuous time and it is through the use of such models, in conjunction with important causal concepts such as composability, local dependence and local independence, and the Bradford Hill criteria, that we shall attempt to infer causality. We use data on the progression to clinical damage in the hand joints of patients suffering from psoriatic arthritis (PsA), under observation at the University of Toronto PsA Clinic, in an effort to demonstrate our approach to causal inference. Specifically, we examine the possibility of a causal link between disease activity and clinical damage at the individual joint level.

Start time 10:25

DESIGN AND ANALYSIS OF DOSE ESCALATION TRIALS

Maria Roopa Thomas

Queen Mary University of London

Keywords: Dose escalation, Cohort effects, Bayesian methods

My research work is motivated by Senn et al. (2007). The Royal Statistical Society (RSS) established an expert group of its own to look into the details of the statistical issues that might be relevant to the Phase I First-in-Man TeGenero trial; the group’s report was published in the Journal of the Royal Statistical Society, Series A. First-in-Man studies aim to find a dose for further exploration in Phase II trials and to determine the therapeutic effects and side effects. Dose escalation trials involve giving increasing doses to different subjects in distinct cohorts. One of the recommendations of the RSS working party was to consider cohort effects. Cohort effects can be influenced by many factors, such as different types of people volunteering at different times, changes in the ambient conditions, the staff running the trial, and the protocols for using subsidiary equipment. With reference to Senn et al. (2007), four designs for three escalating doses and the placebo are taken into account. Using WinBUGS, the cohort effects are fitted and the designs are compared. The variances of the differences between the doses are computed using the WinBUGS software. The area of interest is Bayesian approaches to the design and analysis of dose escalation trials, which involve prior information concerning parameters of the relationships between dose and the risk of an adverse event as well as the desirable effects of the drug. There is a chance to update after every dosing period using Bayes’ theorem. In this talk I will discuss some of these issues.

Start time 10:50

MODELLING PARENTAL DECISIONS FOR NEWBORN BLOODSPOT SCREENING

Stuart Nicholls

Lancaster University, Lancaster, UK

Keywords: latent variable, decision-making, screening, model

A national programme of newborn bloodspot screening has been in place in the UK since 1969. Recent advances have expanded the range of conditions for which screening is available, with a concomitant increase in the information made available to parents. There is a lack of research, however, as to how parents make decisions about newborn bloodspot screening. This paper reports the analysis of a postal questionnaire in order to evaluate a proposed model of parental decision-making for newborn bloodspot screening. Structural equation modelling was used to assess the model, which showed a good level of fit on several goodness-of-fit measures as well as a non-significant χ² value. Squared multiple correlations indicate that a high degree of the variance associated with parental decisional quality is accounted for by its predictors, attitude towards screening and perceived choice, with an increase in perceived choice leading to a perceived improvement in parental decisions. Trust in the staff conducting the screening tests was also significantly related to attitudes towards screening. This analysis suggests that the proposed model is appropriate. The model expands on existing decision-making models, suggesting that decisions are affected by sociological factors, such as perception of choice and trust in staff, as well as rational cognitive elements, such as risk and benefit analyses. This suggests that existing measures of parental decision-making and/or informed choice may be improved by incorporating these elements.

12.2.2 Session 4b: Point Processes and Spatio-temporal Statistics

Session Room: MS.04
Chair: Chris Fallaize

Start time 09:10

POISSON PROCESS PARAMETER ESTIMATION FROM DATA IN BOUNDED DOMAIN

Patrice Marek

University of West Bohemia, Czech Republic

Keywords: Poisson Process, Bounded domain, Parameter estimation, Exponential distribution, Distance-based methods

When we want to estimate the parameter of a Poisson process that describes some natural phenomenon, such as earthquakes, we usually have to rely on a single realization of the process: repetition is impossible because these processes are in the hands of nature. Moreover, we are usually limited by time or finance and can therefore use only a few observations. The approach presented in this paper offers an alternative to the classical distance-based methods presented in the literature. Our approach is based on the estimation of two parameters, the measure of the domain and the parameter of the Poisson process. Using this approach we can avoid censoring, which would be problematic in further research on the spatial Poisson process in a bounded domain. The work has been supported by the grant of the Ministry of Industry and Trade of the Czech Republic MPO 2A 2TP1/051.

Start time 09:35

A COMPARISON OF BAYESIAN SPACE-TIME MODELS FOR OZONE CONCENTRATION LEVELS

Khandoker Shuvo Bakar

School of Mathematics, University of Southampton, UK

Keywords: Space-time modelling, ozone concentrations, auto-regressive model, dynamic linear model, Bayesian spatial prediction

Recently, there has been a surge of interest in space-time modelling of ozone concentration levels. Well-known time series modelling methods such as dynamic linear models (DLM) and auto-regressive (AR) models are being used together with Bayesian spatial prediction (BSP) methods adapted for dynamic data. As a result, practitioners in this field often face the daunting task of selecting among these methods. This paper presents a study comparing three approaches: the DLM approach of Huerta et al. (2004), the BSP method as described by Le and Zidek (2006), and the AR models proposed by Sahu et al. (2007). Recent theoretical results (Dou et al., 2009) comparing the first two approaches are extended to include the AR models. The results are illustrated with a realistic numerical simulation example using information regarding the location of the ozone monitoring sites and observed ozone concentration levels in the state of New York in the months of June and July, 2005-2006. The speed of computation, the availability of high-level software packages for implementing the methods, and the practical difficulties of using the methods for large space-time data sets are also investigated.

Start time 10:00

MULTI-LEVEL MODELS FOR ECOLOGICAL RESPONSE APPLICATIONS

Iain Proctor2, R.I. Smith1 and Prof. E.M. Scott2

1 Centre for Ecology and Hydrology, Edinburgh, UK

2 University of Glasgow, UK

Keywords: Spatial processes, Multi-level models

A problem which occurs often in spatial statistics is how to represent spatial change. Multi-level models are used for interpreting nested datasets, where various covariates are available at differing resolution scales. Used widely in epidemiological studies, this framework is also applicable to population studies. Using this approach, I will model the population trend of carabid communities in upland sites of the United Kingdom. For these locations, environmental variables are measured at the site level; habitat of the surrounding area is defined for each transect, with repeat transect measures at some sites in later years. The setup of these data lends itself naturally to a multi-level model, in which various covariates can be assigned as fixed or random effects. The structure allows one to assign non-Gaussian distributions to the random effects, thereby creating more flexibility in the model.

Start time 10:25

A SPATIO-TEMPORAL MODELLING OF MENINGITIS INCIDENCE IN SUB-SAHARAN AFRICA

Michelle Stanton and Prof. Peter Diggle

School of Health and Medicine, Lancaster University, UK

Keywords: meningococcal meningitis, spatio-temporal, dynamic generalised linear models

An area of sub-Saharan Africa, known as the meningitis belt, is frequently affected by large-scale meningitis epidemics resulting in tens of thousands of cases, and thousands of deaths during epidemic years. The link between the seasonal and spatial patterns of epidemics and the climate has long been recognised, although the mechanisms which cause these patterns are not well understood. The Meningitis Environmental Risk Information Technologies Project (MERIT) is a collaborative project involving the World Health Organization, and members of the environmental, public health and epidemiological communities. One of MERIT’s objectives is to use both routine meningitis surveillance data and information on climatic and environmental conditions to develop a meningitis epidemic decision support tool. This decision support tool could then be used to improve the targeting of preventative and reactive vaccine efforts. Weekly meningitis incidence data have been obtained from the Ethiopian Ministry of Health for the period October 2000 to July 2008 at district (woreda) level. Data on the climate variables most strongly associated with meningitis incidence have been obtained for Ethiopia over the same time period from the International Research Institute (IRI) at Columbia University, New York. We formulate a spatio-temporal dynamic generalised linear model for incidence and describe how the model can be fitted to spatially aggregated incidence data using remotely sensed images of environmental and meteorological factors as explanatory variables. The aim of this project is to enable short-term forecasting of district-level incidence as part of the development of a country-wide meningitis decision support tool.

Start time 10:50

DENOISING UK HOUSE PRICES

Andrew Smith

University of Bristol, UK

Keywords: Nonparametric regression, Penalised regression, Graphs

The British people are obsessed with house prices. There is considerable interest in the difference in price between different areas and in different years. This talk will attempt to show a smooth national trend in house prices, in both space and time. We will look at noisy data, provided by Halifax, on UK house prices and discuss it as a particular example of regression on a graph. There are considerable challenges in the data, most notably the lack of covariate values and missing observations, that make existing regression methods fail. Regression on a graph is a new technique that estimates a denoised version of observations made at the vertices of a graph. It is a type of penalised regression, in which distance from data is penalised at all the vertices, and roughness at all the edges of the graph. These penalty terms present computational challenges, so we will see the result of a new, fast algorithm for regression on a graph.
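
To make the idea of a vertex penalty plus an edge penalty concrete, the sketch below minimises sum_i (y_i - f_i)^2 + lam * sum over edges (f_i - f_j)^2 on a small graph. The abstract does not state the form of either penalty, so the squared-error choices here are assumptions, as are the toy data and the parameter name `lam`. The solver is plain Gauss-Seidel iteration, not the fast algorithm mentioned in the talk, and missing observations are not handled.

```python
def denoise_on_graph(y, edges, lam, sweeps=200):
    """Minimise sum (y_i - f_i)^2 + lam * sum_{(i,j)} (f_i - f_j)^2."""
    n = len(y)
    neighbours = [[] for _ in range(n)]
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    f = list(y)
    for _ in range(sweeps):
        for i in range(n):
            # Setting the derivative with respect to f_i to zero gives
            # f_i * (1 + lam * deg_i) = y_i + lam * (sum of neighbouring f_j).
            total = sum(f[j] for j in neighbours[i])
            f[i] = (y[i] + lam * total) / (1 + lam * len(neighbours[i]))
    return f


# Toy example: five vertices on a path, with one noisy spike at vertex 2.
y = [1.0, 1.2, 5.0, 0.9, 1.1]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
f = denoise_on_graph(y, edges, lam=1.0)
```

Larger values of `lam` pull neighbouring vertices closer together; `lam = 0` returns the data unchanged. With a quadratic edge penalty the total of the fitted values equals the total of the observations, so the smoothing redistributes the spike rather than removing it.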

12.2.3 Session 4c: General

Session Room: MS.05
Chair: Michael Tsagris

Start time 09:10

MIXTURE OF LATENT TRAIT ANALYZERS

Isabella Gollini and Thomas Brendan Murphy

University College Dublin, Dublin 4, Ireland

Keywords: Binary Data Models, Latent Variable Models, Mixture Models, Variational Methods

Latent class analysis and latent trait analysis are two of the most common latent variable models for categorical data. Sometimes these models are not sufficient to summarize the data, especially when the data come from a heterogeneous source, the variables are highly dependent and/or the data dimensionality is large. The mixture of latent trait analyzers model extends latent class analysis and latent trait analysis by assuming a model for the categorical response variables that depends on both a categorical latent class and a continuous latent trait variable. Fitting the mixture of latent trait analyzers model is difficult because the likelihood function involves an integral that cannot be evaluated analytically. We focus on the variational approach, which works particularly well when the dimensionality of the data is large.

Start time 09:35

A WAVELET BASED APPROACH TO HPLC DATA ANALYSIS

Jennifer Klapper and Dr. Stuart Barber

Department of Statistics, University of Leeds, UK

Keywords: Wavelets, High Performance Liquid Chromatography, Vaguelette-Wavelet

High Performance Liquid Chromatography (HPLC) is a process by which chemical compounds are separated into their constituent ingredients. The data produced by this type of experiment can be viewed as a time-dependent baseline with intermittent peaks. The locations of these peaks indicate which chemicals are present, and the area underneath each peak indicates the quantity of the relevant chemical. However, there are many issues which confound peak identification and quantification; these include the presence of background noise in the data and baseline drift. These problems, amongst others, mean that a certain amount of preprocessing is needed before any type of quantification can take place. We use wavelet denoising techniques to remove the background noise and eliminate the effects of baseline drift. We subsequently use vaguelette-wavelet methods to estimate the derivatives of the data and thus locate the peaks within the data. Finally, numerical integration is used to calculate the areas under the peaks.
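
The last two steps, locating peaks from the derivative and integrating under them, can be sketched on an idealised signal. This is a simplification under stated assumptions: the signal is taken to be already denoised and baseline-corrected, first differences stand in for the vaguelette-wavelet derivative estimates, and the trace below is invented for illustration.

```python
def trapezoid(t, y):
    """Trapezoidal-rule integral of y over the time points t."""
    return sum((t[k + 1] - t[k]) * (y[k] + y[k + 1]) / 2
               for k in range(len(t) - 1))


def local_maxima(y, threshold=0.0):
    """Indices where the first difference changes sign from + to -."""
    peaks = []
    for k in range(1, len(y) - 1):
        rising = y[k] - y[k - 1] > 0
        falling = y[k + 1] - y[k] < 0
        if rising and falling and y[k] > threshold:
            peaks.append(k)
    return peaks


def peak_area(t, y, k, threshold=0.0):
    """Integrate the peak around index k out to where it meets the threshold."""
    lo = k
    while lo > 0 and y[lo] > threshold:
        lo -= 1
    hi = k
    while hi < len(y) - 1 and y[hi] > threshold:
        hi += 1
    return trapezoid(t[lo:hi + 1], y[lo:hi + 1])


# Hypothetical baseline-corrected trace with two peaks.
t = [float(k) for k in range(11)]
y = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0]

peaks = local_maxima(y)
areas = [peak_area(t, y, k) for k in peaks]
```

On real chromatograms the difference-based detector would be very sensitive to noise, which is exactly why the abstract applies denoising and smoother derivative estimates first.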

Start time 10:00

DELETE-REPLACE IDENTITY FOR A SET OF INDEPENDENT OBSERVATIONS

Sakyajit Bhattacharya1, Brendan Murphy1 and John Haslett2

1 University College Dublin

2 Trinity College Dublin

The delete-replace diagnostic method is developed in the context of a general model of independent observations. If a set of observations is deleted then it is shown to be estimated by the remaining observations. The identity is shown to hold in particular in the case of a scalar sufficient statistic. In a multi-parameter case the delete-replace identity holds conditionally. As an example, the exponential family is explored and delete-replace is shown to hold for a one-parameter exponential family. For a curved exponential family the necessary and sufficient conditions for delete-replace are derived. The estimate of the set of deleted observations is shown to depend only on the sufficient statistic. More particularly, the estimate turns out to be the maximum likelihood estimator of the parameter. The delete-replace identity holds only for sets of independent observations. A counterexample is derived for a set of dependent observations, where the identity does not hold.

Start time 10:25

MODELLING MAIN CONTRACTOR STATUS FOR THE NEW ORDERS SURVEY

Ria Sanderson and Salah Merad

Office for National Statistics, UK

In the past, the New Orders survey sampled only main contractors; this population could be identified because the main contractor (MC) status was collected from an annual census. Following the transfer of construction statistics to the Office for National Statistics, the MC status is now collected through the Business Register and Employment Survey (BRES). This change means that, for small businesses, the population of MCs can no longer be identified (since only a very small number of these businesses are sampled as part of BRES) and hence all small businesses are eligible for selection by the New Orders survey. One important consideration therefore is non-response, as it is unknown whether the non-response rate will be the same for both MCs and non-MCs. In order to reduce potential non-response bias, we introduce a calibration weight which requires an accurate estimate of the number of MCs in the population. We use data from BRES to build a model, and apply it to every business in the population to give each a predicted probability of being an MC. The small number of businesses in BRES means that we have only been able to construct a model that yields accurate estimates of the number of MCs at high levels of aggregation. However, there could be differential non-response within these levels. Therefore, in the future, we would like to make use of past data to update the predicted probabilities, which should allow for accurate estimates at lower levels of aggregation. In this talk, I will describe briefly the sampling design and the estimation method in the New Orders survey, present some results from the modelling of the MC status, and then discuss data errors in the reporting of the MC status.

Start time 10:50

BAYES LINEAR KINEMATICS IN THE ANALYSIS OF FAILURE RATES

Kevin Wilson

Newcastle University, UK

Keywords: Bayesian inference, Bayes linear kinematics, count data, failure rates

Collections of related Poisson counts arise, for example, from numbers of failures in similar machines or neighbouring time periods. A conventional Bayesian analysis requires a rather indirect prior specification and intensive numerical methods for posterior evaluations. An alternative approach is presented, using Bayes linear kinematics, in which simple conjugate specifications for individual counts are linked through a Bayes linear belief structure. The use of transformations of the Poisson parameters is proposed. The approach is illustrated using an example involving Poisson counts of failures.
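
For a single Poisson rate, the standard conjugate specification is a gamma prior, whose update is a one-line calculation. The sketch below shows only that building block; the prior values and counts are hypothetical, and the Bayes linear kinematic step that links several such rates is not shown.

```python
def gamma_poisson_update(shape, rate, counts, exposure=1.0):
    """Posterior Gamma(shape, rate) for a Poisson rate after observing counts.

    Each count is assumed to be Poisson with mean exposure * lambda, where
    lambda has a Gamma(shape, rate) prior.
    """
    return shape + sum(counts), rate + exposure * len(counts)


# Hypothetical prior and failure counts for one machine.
prior_shape, prior_rate = 2.0, 1.0
failures = [3, 1, 4]

post_shape, post_rate = gamma_poisson_update(prior_shape, prior_rate, failures)
posterior_mean = post_shape / post_rate
```

Here the prior mean of 2 failures per period moves to a posterior mean of 2.5 after three periods with 8 failures in total.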

12.2.4 Session 4d: Graphical Models and Extreme Value Theory

Session Room: A1.01
Chair: Guy Freeman

Start time 09:10

UNCERTAINTY IN CHOICE OF MEASUREMENT SCALE FOR EXTREME VALUE ANALYSIS

Jenny Wadsworth1, Jonathan Tawn1 and Philip Jonathan2

1 Lancaster University

2 Shell Technology Centre Thornton

Keywords: Extreme Value Theory, Measurement Scale, Significant Wave Height

The effect of the choice of measurement scale upon inference and prediction from extreme value models is examined. When measurements of the same process are recorded on different scales linked by a non-linear transformation, separate extreme value analyses carried out on the two scales can lead to highly discrepant conclusions concerning future extremes of the process. For some distributions it turns out that there is in fact an optimal choice of scale to minimise the bias of the model. This talk describes how a Box-Cox transformation can be incorporated into an analysis, providing a parametric methodology to account for scale uncertainty. An example dataset of significant wave height measurements is used to illustrate both the problem and the new methodology.
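
For reference, the Box-Cox family maps a positive measurement x to (x^lam - 1) / lam for lam not equal to 0, and to log x for lam = 0. The sketch below implements the transformation and its inverse only; how the parameter is estimated within an extreme value analysis is not shown, and the function and parameter names are illustrative.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires a positive value")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam


def inverse_box_cox(y, lam):
    """Map a transformed value back to the original measurement scale."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1) ** (1 / lam)


# lam = 1 only shifts the data, lam = 0.5 is a square-root scale,
# and lam = 0 is the log scale.
shifted = box_cox(4.0, 1.0)
root_scale = box_cox(4.0, 0.5)
log_scale = box_cox(4.0, 0.0)
```

Treating `lam` as an unknown parameter is what allows the choice of scale to be estimated rather than fixed in advance.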

Start time 09:35

MODELLING EXTREMAL PHENOMENA USING DIFFERENT DATA SOURCES

Ben Youngman

University of Sheffield, UK

Keywords: Extreme value theory, Spatial modelling

A common problem in the modelling of extremes of phenomena is sparsity or poor quality of data. This may be because few extremes have occurred or because extremes are difficult to measure. A consistent source of non-observational data comes from numerical model output, e.g. climate models. Typically these provide data of high spatiotemporal resolution, yet often poorly capture the behaviour of extremes. Here a method is proposed to characterise this inaccuracy. This is done by relating the model output to some proximate observational data, both of which theoretically quantify the same phenomenon.

Start time 10:00

PARAMETRISATION OF GRAPHICAL MODELS

Simon Byrne

Statistical Laboratory, University of Cambridge, UK

Keywords: Graphical models, Bayesian inference, Covariance matrix estimation

Graphical models have recently become popular tools in statistics and related fields. A graphical model is a joint probability distribution which has certain conditional independence properties, known as Markov properties, based on the structure of a graph. This graph provides both an aid to the human comprehension of complex multivariate models and a framework for efficient computation of the marginal and conditional distributions, by exact, approximate or sampling-based methods. This talk will focus on the problem of efficiently parameterising families of such distributions. If the parameters for the conditionally independent components themselves have certain independence properties, so-called “hyper Markov properties”, then the problem of parameter estimation, in both a maximum likelihood and a Bayesian framework, can be simplified by local computations. I will provide some examples and applications of these properties.

Start time 10:25

BAYESIAN INFERENCE FOR SOCIAL NETWORK MODELS

Alberto Caimo

University College Dublin, Ireland

Keywords: Exponential random graph models, MCMC methods, Bayesian inference

Exponential random graph models are widely used and studied models for social networks. Despite their popularity, they are extremely difficult to handle from a statistical viewpoint since their normalising constant is available only in very trivial cases. We propose to carry out the estimation using a Bayesian framework via the exchange algorithm of Murray et al. (2006), which circumvents the need to calculate the normalising constants of the posterior density. Moreover, we propose to further improve mixing and local moves on the posterior support using a population MCMC approach with snooker update. This method improves performance with respect to the widely used Monte Carlo maximum likelihood estimation, whose convergence is often troublesome.

12.2.5 Session 5a: Experimental Design and Population Genetics

Session Room: MS.01
Chair: Andrew Simpkin

Start time 11:30

CANONICAL ANALYSIS OF MULTI-STRATUM RESPONSE SURFACE DESIGNS & STANDARD ERRORS OF EIGENVALUES

Mudakkar M. Khadim

School of Mathematical Sciences, Queen Mary University of London, UK

Keywords: Response surface methods, Canonical analysis, Eigenvalues, Multi-stratum Design

Bisgaard and Ankenman described the double linear regression method to obtain the standard errors for the eigenvalues in second-order response surface models. However, they discussed this method only for a completely randomized error control structure. In many industrial experiments, the experimenter might not be able to perform complete randomization and hence might be forced to use multi-stratum error control structures, of which the split-plot design is a special case. We have tried to apply the same double linear regression method to multi-stratum error control structures to get the standard errors for the eigenvalues in second-order response surface models.

Start time 11:55

D-OPTIMAL DESIGN OF EXPERIMENTS FOR A DYNAMIC MODEL WITH CORRELATED OBSERVATIONS

Kieran Martin, Stefanie Biedermann, Susan Lewis, David Woods and EPSRC CASE project supported by GlaxoSmithKline

University of Southampton, UK

Keywords: experimental design, dynamic models

Models derived from differential equations occur frequently in the pharmaceutical industry. Optimal designs for these models are required to gather information for model fitting. Finding such designs can be problematic: the models will usually be non-linear, making the optimal choice of design parameter-dependent, and the observations may be correlated. We aim to find designs which will give accurate estimates of the model parameters while remaining robust to the effects of correlation and parameter uncertainty. We find pseudo-Bayesian D-optimal designs to meet these objectives, then use a simulation study to assess their robustness by calculating the mean square error for each design. We demonstrate that the designs found still perform well when the domains of the prior parameter distributions are mis-specified.

Start time 12:20

VULNERABILITY: A 2ND CRITERION TO DISTINGUISH BETWEEN EQUALLY-OPTIMAL BIBDS

Helen Thornewell

Maths Department, University of Surrey, Guildford, UK

Keywords: Balanced Incomplete Block Designs (BIBDs), Disconnectedness, Observation Loss, Optimality, Robustness, Selection, Vulnerability

If a Balanced Incomplete Block Design (BIBD) exists for the parameters, it is known that such a design is universally optimal. However, if there exists more than one BIBD with the same parameters, is one design better than the other? Is optimality the only criterion that needs to be tested at design selection? Are there ways of distinguishing between non-isomorphic, equally-optimal BIBDs? Many experiments suffer from observation loss during the course of the experiment. This may result in a disconnected eventual design, so that not all pairwise treatment comparisons can be estimated and the null hypothesis cannot be tested. In order to guard against poor eventual designs, I have introduced a Vulnerability Measure to determine how likely a design is to become disconnected. The formulae depend on the design concurrences. Are some BIBDs more vulnerable than others? My new robustness criterion is compared to other criteria from the literature. For example, Prescott & Mansson (2001) consider the robustness of designs against the loss of any two single observations, which depends on the block intersection sizes. Are there combinatorial links between block intersections and concurrences? Is the least vulnerable BIBD for disconnectedness also the most robust BIBD against the loss of single observations? Does one criterion provide more information than the other for comparison, selection and construction of BIBDs? General theorems, formulae and results will be presented and interactive examples using sets of complement BIBDs will be demonstrated in order to answer these questions and more...
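
Two of the ingredients mentioned above, concurrences and connectedness, are easy to compute for a small design. The sketch below uses the seven-block Fano plane, a standard BIBD with 7 treatments in blocks of size 3, as an example of my own choosing. It does not implement the Vulnerability Measure itself, whose formulae are not given in the abstract.

```python
from itertools import combinations


def concurrences(blocks):
    """Count how many blocks contain each pair of treatments."""
    counts = {}
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


def is_connected(blocks, treatments):
    """True if every treatment can be reached from any other via shared blocks."""
    treatments = set(treatments)
    if not treatments:
        return True
    seen = set()
    frontier = [next(iter(treatments))]
    while frontier:
        t = frontier.pop()
        if t in seen:
            continue
        seen.add(t)
        for block in blocks:
            if t in block:
                frontier.extend(x for x in block if x not in seen)
    return seen == treatments


# Fano plane: 7 treatments, 7 blocks of size 3, every pair concurs once.
fano = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5},
        {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]
pair_counts = concurrences(fano)

# A toy design that has fallen apart into two separate pieces.
broken = [{0, 1}, {2, 3}]
```

In a BIBD all pairs of treatments have the same concurrence; after observations are lost, rerunning `is_connected` on the reduced blocks shows whether all pairwise treatment comparisons are still estimable.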

Start time 12:45

SURFING IN ONE DIMENSION

Emma Kershaw

University of Bristol, Statistics Group

Keywords: Coalescent, Population Genetics, Stochastic Processes, Population Expansion

Geographical expansions of a population have occurred throughout history, with humans believed to have expanded out of Africa in the last 100,000 years. They are of particular interest in the field of evolutionary biology as they can have a drastic effect on the distribution and diversity of genes in the newly colonized area. Such genetic phenomena have been used as markers to indicate possible range expansions in the past. This talk considers the phenomenon of genes surfing on the wave front of an expanding population in one dimension and we introduce some classical statistical population genetics models. Two simulations are introduced which explore the problem further. A forward-in-time model using classical population genetics theory enables an exact ancestral graph of individuals at the wave front to be constructed and used as a means of comparison for the second model, an approximate backward-in-time simulation which attempts to estimate this ancestral distribution using methods of coalescent theory.

Start time 13:10

DIMENSION REDUCTION FOR HUMAN GENOMIC SNP VARIATION

Colette Mair and Dr. Vincent Macaulay

University of Glasgow, UK

Keywords: population structure, Wright’s island model

We will discuss ways of detecting population structure from genetic data from a set of individuals, each belonging to one population from a genetic collection of populations. The main question of interest is whether the set of individuals belongs to a larger homogeneous population or whether the population can be segregated into subpopulations that are genetically distinct. This is important since a great deal of genetic analysis assumes independence of individual genotypes, which may be violated through population structure. As a result, failing to correct for population structure can give misleading results. Further, discovering population structure can help us understand the demographic history of the populations of interest.
One of the many issues with such studies is dealing with the large quantity of data. Over the last decade or so, SNP data have become widely available in vast quantities. This is the type of data we will consider throughout. A single nucleotide polymorphism, or SNP, is a position in the DNA sequence which is known to be variable in the populations of interest. Since we will be dealing with a large number of variables (SNPs), we will consider principal components analysis. This was first introduced to the study of genetic data over 30 years ago and is a common statistical tool for reducing the dimension of data to relatively few components while still accounting for a substantial part of the variation. Each component will capture a proportion of the population structure present in the data, if any. Established software can be used which, given such SNP data and using principal component analysis, can determine if population structure is present in the data. By observing a biplot from a real data set and also using simulated data, correlations with geographical locations will be considered. Such correlations have been observed recently, for example, in Europe.
We will firstly consider SNP data from the Human Genome Diversity Panel, consisting of roughly 1050 individuals from 50 countries all genotyped at around 650,000 SNPs. From there, we will briefly consider simulated data under Wright’s island model. With this model, simulation of SNPs from a number of populations is possible with the amount of migration between populations controlled. This simplified model will help illustrate the ideas presented but is only one of many possible models. However, it is useful in demonstrating population structure and correlations between geographical and genetic distance.
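
As a rough illustration of how principal components can expose population structure, the sketch below runs PCA on simulated genotypes. The simulation is a crude two-population stand-in chosen for brevity, not Wright's island model, and the function names and parameter values are assumptions made only for this example.

```python
import numpy as np

def simulate_genotypes(rng, n_per_pop=50, n_snps=1000, drift=0.15):
    """Two populations whose allele frequencies diverge from a shared ancestral value."""
    ancestral = rng.uniform(0.1, 0.9, size=n_snps)
    genotypes, labels = [], []
    for pop in range(2):
        freqs = np.clip(ancestral + rng.normal(0.0, drift, size=n_snps), 0.05, 0.95)
        # Each genotype counts copies (0, 1 or 2) of the reference allele at a SNP.
        genotypes.append(rng.binomial(2, freqs, size=(n_per_pop, n_snps)))
        labels += [pop] * n_per_pop
    return np.vstack(genotypes).astype(float), np.array(labels)

def principal_components(genotypes, n_components=2):
    """Centre each SNP column and project individuals onto the leading components."""
    centred = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]

rng = np.random.default_rng(0)
X, labels = simulate_genotypes(rng)
scores, explained = principal_components(X)
```

Plotting the first column of `scores` against the second, coloured by `labels`, would be expected to show the two simulated populations as separate clusters along the first component.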

12.2.6 Session 5b: Censoring in Survival Data and Non-Parametric Statistics

Session Room: MS.04
Chair: Jennifer Rogers

Start time 11:30

PARAMETRIC SURVIVAL MODEL WITH TIME-DEPENDENT COVARIATES FOR RIGHT CENSORED DATA

Hisham Abdel Hamid Elsayed

Statistics Group, School of Mathematics, University of Southampton, UK

Keywords: Parametric models, Right censoring, Splines, Time-dependent covariates

One standard approach in survival analysis is to use the Cox proportional hazards regression model. This can easily be extended to incorporate one or more covariates whose values are subject to change over time. An alternative and potentially more efficient approach is to use simple parametric accelerated failure time models with standard survival distributions such as the Weibull, log-logistic and log-normal. Again these models may be extended to incorporate time-dependent covariates. However, in some areas of medical statistics simple parametric models often fit poorly. In this paper the standard Weibull regression model is extended to incorporate time-dependent covariates and made more flexible by using splines. The competing methods are implemented and compared using two large data sets (supplied by NHS Blood and Transplant) of survival times of corneal grafts and heart transplant patients.

Start time 11:55

ASSESSING THE EFFECT OF INFORMATIVE CENSORING IN PIECEWISE PARAMETRIC SURVIVAL MODELS

Natalie Staplin

University of Southampton

Keywords: Survival analysis, Informative censoring, Sensitivity analysis, Parametric models, Piecewise exponential

Many of the standard techniques used to analyse censored survival data assume that there is independence between the failure time and censoring processes. There are situations where this assumption can be questioned, especially when looking at medical data. It would be useful to know whether we can assume independence or whether we need a model that takes account of any dependence. The method presented here assesses the sensitivity of the parameter estimates in parametric models to small changes in the amount of dependence between failure time and censoring. Parametric models with piecewise hazard functions are considered to allow a greater amount of flexibility in the models that may be fitted. In particular, piecewise constant hazard functions are considered, which means the piecewise exponential model is being used. This method is applied to a dataset that follows patients registered on the waiting list for a liver transplant. It suggests that in some cases even a small change in the amount of dependence can have a large effect on the results obtained.
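
For orientation, the piecewise exponential model has a simple closed-form fit when censoring is independent of failure: the hazard in each interval is estimated by the number of failures divided by the total time at risk in that interval. The sketch below shows only this baseline fit, not the sensitivity analysis of the abstract; the function name and the toy data are hypothetical.

```python
def piecewise_exponential_mle(times, events, cuts):
    """MLE of a piecewise-constant hazard: failures / exposure within each interval.

    times  : observed follow-up times (failure or censoring)
    events : 1 if the time is a failure, 0 if right-censored
    cuts   : increasing cut points; intervals are (0, c1], (c1, c2], ..., (cK, inf)
    Assumes censoring is independent of the failure process.
    """
    edges = [0.0] + list(cuts) + [float("inf")]
    deaths = [0] * (len(edges) - 1)
    exposure = [0.0] * (len(edges) - 1)
    for t, d in zip(times, events):
        for k in range(len(edges) - 1):
            lo, hi = edges[k], edges[k + 1]
            if t <= lo:
                break                        # subject left the risk set before this interval
            exposure[k] += min(t, hi) - lo   # time at risk spent in interval k
            if d == 1 and t <= hi:
                deaths[k] += 1               # failure falls in interval k
    return [d / e if e > 0 else float("nan") for d, e in zip(deaths, exposure)]

# Toy data: four subjects, one censored at time 2, with a single cut point at 2.
hazards = piecewise_exponential_mle([1, 2, 3, 4], [1, 0, 1, 1], cuts=[2])
```

Here the first interval has one failure over 7 units of exposure and the second has two failures over 3 units, which gives the two hazard estimates.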

Start time 12:20

DEALING WITH CENSORING IN QUALITY ADJUSTED SURVIVAL ANALYSIS AND COST EFFECTIVENESS ANALYSIS

Howard Thom

Biostatistics Unit, University of Cambridge, UK

Keywords: Cost Effectiveness Analysis, Health Economics, Censoring, Inverse Probability Weighting, Bootstrapping

Estimation of average costs and quality adjusted life years is often complicated by heavy censoring in the data, as this censoring is implicitly informative. Simple empirical means are biased, and standard survival analysis methods are inappropriate. For the purposes of cost-effectiveness analysis, it is necessary to obtain unbiased estimates of the means and variances of our quantities. This issue will be illustrated with a contemporary example comparing the cost-effectiveness of four functional diagnostic tests in the diagnosis and management of coronary artery disease. The method of inverse-weighting will be applied to this example, and an analytic form for variance estimates, derived by Willan et al, will be discussed in comparison with a simple bootstrap method.

Start time 12:45

NONPARAMETRIC PREDICTIVE INFERENCE FOR SYSTEM RELIABILITY

Ahmad M Aboalkhair

Durham University, UK

Keywords: k-out-of-m systems, lower and upper probabilities, nonparametric predictive inference, redundancy allocation, series-parallel systems, system reliability

Recently, the application of a novel statistical method called nonparametric predictive inference (NPI) to problems of system reliability has been presented. In NPI, relatively weak statistical modelling assumptions are made, which is made possible by the use of lower and upper probabilities to quantify uncertainty, leading to inferences which are strongly based on observed data and which explicitly consider future observable events. Throughout this work, attention is on lower and upper probabilities for system functioning, given binary test results on components. As such, it takes uncertainty about component functioning and indeterminacy due to limited test information explicitly into account. Lower and upper probabilities, also known as imprecise probability, have several advantages over classical (precise) probability in a reliability context. Coolen-Schrijner et al (2008) considered systems that are series configurations of subsystems, with each subsystem a voting system (’k-out-of-m’ system) which consists of only one type of component, and different subsystems consisting of components of different types. They presented a powerful optimal algorithm for redundancy allocation for such systems, for the situation where components of all types have been tested with zero failures found in the tests. MacPhee et al (2009) generalized this to general test results. We present the basic results of NPI for system reliability, followed by a detailed presentation of optimal redundancy allocation following general component test results, and outline related research challenges.

Start time 13:10

NONPARAMETRIC ESTIMATION OF RELIABILITY OF TWO RANDOM VARIABLES USING KERNEL ESTIMATION OF DENSITY

Tomas Toupal

University of West Bohemia, Czech Republic

Keywords: Bivariate distribution, Nonparametric estimation, Reliability, Kernel estimation, Density and distribution function

This talk discusses the problem of reliability estimation, particularly for bivariate distributions. In practice it has many applications, especially in engineering (such as structures, static fatigue and the ageing of concrete pressure vessels), medicine, quality control, military service and the balance of payments.
The parametric estimation of the density and distribution function of reliability following a specified distribution has been discussed extensively in the literature. Hence, in this talk I will present kernel estimation of the density and distribution function using several types of kernels.
In the final part I will use the results of this estimation to demonstrate how to obtain the reliability from the kernel estimate, and apply it to experimental data on the balance of payments of the Czech Republic. In this case, reliability is represented by the event that the total expenditure is not higher than the total income.
The work has been supported by the grant of Ministry of Industry and Trade of the Czech Republic MPO 2A 2TP1/051.

12.2.7 Session 5c: Time Series and Diffusions

Session Room: MS.05
Chair: Alexander Strawbridge

Start time 11:30

SEQUENTIAL INTEGRATED NESTED LAPLACE APPROXIMATION

Arnab Bhattacharya and Simon Wilson

Trinity College Dublin, Ireland

Keywords: Bayesian inference, Sequential methods

This work addresses the problem of sequential inference of time series in real time, which will be improved further to deal with spatio-temporal models. The idea is to develop a fast functional approximation scheme so as to perform real-time data analysis of unknown quantities, given observations, which are dependent on some underlying latent variable.
The problem is defined as follows: the observed variables $Y_t$, $t \in \mathbb{N}$, $Y_t \in \mathcal{Y}$, are assumed to be conditionally independent given the latent process $X_t$ (assumed to be a GMRF) and the unknown hyperparameters $\Theta$, and can have any distribution. The primary aim is to estimate the posterior distribution $P(x_{0:t} \mid y_{1:t}, \theta)$ and also the filtering density $P(x_t \mid y_{1:t}, \theta)$. The computation of these two terms necessarily requires the estimation of the posterior density of $\Theta$. We are interested in providing sequential solutions for both $P(\theta \mid y_{1:t})$ and $P(x_t \mid y_{1:t}, \theta)$. The new method is motivated by a recently published technique known as the Integrated Nested Laplace Approximation (INLA), developed by Rue et al (2009). The procedure has already been implemented on linear Gaussian state-space models with unknown state of the system and covariance parameters and has proved to be very accurate and fast. We consider implementing it in the generalized case where there is nonlinearity and non-Gaussianity.

Start time 11:55

FINDING CHANGEPOINTS IN A GULF OF MEXICO HURRICANE HINDCAST DATASET

Rebecca Killick (1), Idris Eckley (1), Kevin Ewans (2) and Philip Jonathan (3)

(1) Maths & Stats, Lancaster University
(2) Shell International Exploration & Production, Netherlands
(3) Shell Technology Centre Thornton, Chester

Keywords: Changepoints, Likelihood, Schwarz Information Criterion, Bayesian Information Criterion, GOMOS

Statistical changepoint analysis is used to detect changes in variability within GOMOS hindcast time-series for significant wave heights of storm peak events across the Gulf of Mexico for the period 1900-2005. To detect a change in variance, the two-step procedure consists of (1) validating model assumptions per geographic location, followed by (2) application of a penalised likelihood changepoint algorithm. Results suggest that the most important changes in time-series variance occur in 1916 and 1933 at small clusters of boundary locations at which, in general, the variance reduces. No post-war changepoints are detected. The changepoint procedure is readily applied to other environmental time-series.
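
A penalised likelihood search for a single change in variance can be sketched as follows. This is a generic illustration under a Gaussian model with constant mean, not the authors' implementation; the function name, the minimum segment length and the default penalty of 2 log n are assumptions made for the example.

```python
import numpy as np

def variance_changepoint(x, min_seg=5, penalty=None):
    """Return the most likely single change in variance, or None if the penalty is not beaten."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    resid2 = (x - x.mean()) ** 2              # mean assumed constant over the series
    if penalty is None:
        penalty = 2.0 * np.log(n)             # assumption: SIC-style cost for 2 extra parameters
    csum = np.concatenate(([0.0], np.cumsum(resid2)))
    null_cost = n * np.log(csum[n] / n)       # -2 log-likelihood, up to a constant
    best_tau, best_cost = None, np.inf
    for tau in range(min_seg, n - min_seg + 1):
        var1 = csum[tau] / tau                # MLE of variance before the change
        var2 = (csum[n] - csum[tau]) / (n - tau)
        cost = tau * np.log(var1) + (n - tau) * np.log(var2)
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau if null_cost - best_cost > penalty else None

# Simulated series whose standard deviation jumps from 1 to 3 at index 100.
rng = np.random.default_rng(1)
series = np.concatenate([rng.normal(0, 1, 100), rng.normal(0, 3, 100)])
tau_hat = variance_changepoint(series)
```

Multiple changepoints could be handled by applying the same test recursively to each segment (binary segmentation), at the cost of extra bookkeeping.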

Start time 12:20

PREDICTION INTERVALS OF THE LOCAL SPECTRUM ESTIMATE

Kara Stevens

University of Bristol, UK

Keywords: Time series, locally stationary, Bayesian wavelet shrinkage, localized autocovariance, local spectrum prediction intervals

Time series data occur in many disciplines such as finance and medicine. Often there is a dependence structure between time series observations. The typical indicator of this dependence is the covariance function. If a time series is second order stationary then the mean and variance are constant, and the covariance only depends on the time difference between observations. However, many time series are not stationary. One class of non-stationary time series is the locally stationary time series, which possess slowly evolving second order quantities, such as variance. In these cases models that assume stationarity are inappropriate and alternative methods should be used.
An interesting class is that of locally stationary wavelet models, which can be used to define a localized autocovariance, calculated from an evolutionary wavelet spectrum. This is similar to the spectrum used to analyse stationary time series in the frequency domain, but it is expressed within the wavelet domain and changes through time. The evolutionary wavelet spectrum is estimated from data through the wavelet periodogram. This quantity is asymptotically unbiased but not consistent.
We have developed an empirical Bayesian wavelet shrinkage method to smooth the wavelet periodogram and thus improve our estimation of the evolutionary wavelet spectrum. Our method has the advantage of producing prediction intervals and probabilities associated with the evolutionary wavelet estimate. The new methodology will be compared with current techniques.

Start time 12:45

DISCRETE- AND CONTINUOUS-TIME APPROACHES TO IMPORTANCE SAMPLING ON DIFFUSIONS

David Suda

University of Lancaster, UK

Keywords: stochastic calculus, Bayesian inference, computational statistics

In this talk we shall tackle the problem of importance sampling methods for diffusions. We start by approximating an Itô diffusion by a discrete-time Markov chain using the Euler discretization, and then implementing importance sampling methods appropriate for discrete-time Markov chains. This setting is simpler to conceive as it only requires the understanding of the Radon-Nikodym derivative for finite-dimensional distributions. We then look at the problem within a continuous-time context. In this case, one requires the understanding of the Radon-Nikodym derivative with respect to probability measures which are infinite-dimensional. In actual practice, continuous-time importance sampling is never implemented exactly. However, it will be useful in constructing new proposal densities, and it can also prove useful in analyzing the asymptotic behaviour of importance sampling weights. Some empirical results based on a simulation study of the above shall also be presented.
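
The discrete-time setting can be sketched in a few lines. The example below assumes a target diffusion dX = b(X) dt + dW with unit diffusion coefficient and uses driftless Brownian increments as the proposal; the importance weight of a path is then the product of ratios of Gaussian Euler transition densities. The function name and all parameter values are hypothetical choices for illustration.

```python
import numpy as np

def euler_importance_weights(drift, n_paths=20000, n_steps=50, horizon=1.0, x0=0.0, seed=0):
    """Simulate driftless Euler paths and weight them towards dX = drift(X) dt + dW."""
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    x = np.full(n_paths, x0, dtype=float)
    log_w = np.zeros(n_paths)
    for _ in range(n_steps):
        dx = rng.normal(0.0, np.sqrt(dt), size=n_paths)   # proposal: Brownian increment
        b = drift(x)
        # log ratio of transition densities N(dx; b*dt, dt) / N(dx; 0, dt)
        log_w += b * dx - 0.5 * b**2 * dt
        x = x + dx
    return x, np.exp(log_w)

# Target: Ornstein-Uhlenbeck drift b(x) = -x, started at 0.
x_T, w = euler_importance_weights(lambda x: -x)
second_moment = np.sum(w * x_T**2) / np.sum(w)   # self-normalised estimate of E[X_T^2]
```

The accumulated log-weight is the discrete analogue of the Girsanov exponent, which is where the continuous-time Radon-Nikodym derivative enters as the step size shrinks.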

Start time 13:10

BAYESIAN INFERENCE FOR DIFFUSIONS BASED ON EXACT SIMULATION

Isadora Antoniano-Villalobos and Prof. Stephen Walker

University of Kent, UK

Keywords: Univariate diffusions, Exact Simulation, Bayesian non-parametric, Consistency

When a certain phenomenon is modelled by means of a real-valued diffusion process, the model is often stated in terms of a stochastic differential equation (SDE). Statistical inference in this context is then aimed at the estimation of parameters appearing in the drift and diffusion coefficients of the SDE. When exact simulation via MCMC is used for Bayesian estimation, the algorithm introduces latent variables which transform the model into a Bayesian non-parametric model.
In this framework, we propose a way of using the exact simulation algorithm for Bayesian estimation of the parameters of a specific family of SDEs. We then study the consistency of the resulting posterior densities of the parameters involved when the number of data points of a single diffusion path grows within a fixed time interval.

12.2.8 Session 5d: Probability

Session Room: A1.01
Chair: Duy Pham

Start time 11:30

A NEW BIVARIATE GENERALIZED PARETO MODEL

Antonio A. Ortiz Barranon and Stephen Walker

University of Kent, UK

Keywords: Extreme Value Theory, Generalized Pareto Distribution

Recently, Extreme Value Theory (EVT) has become a well developed area of research. However, some open problems in the multivariate case remain, since these types of distributions present more complications, principally in the dependence structure. So far, the bivariate case has been the main focus of multivariate EVT. One of the concepts that underpin this theory is tail dependence, which is a measure of the dependence between two variables given that one of them is extreme. Most of the approaches found in the literature deal with the problem via the use of copulas.
In the present project, we present a model not based on copulas. We handle the data with a simple parametric model that leads to easier computation of the tail dependence and that does not involve the difficulties that copula models have shown.

Start time 11:55

BACKWARD INDUCTION AND SUBTREE PERFECTNESS

Nathan Huntley and Matthias C. M. Troffaes

Durham University, UK

Keywords: Sequential Decision Making, Backward Induction, Separability, Preference Ordering, Independence Principle, Normal Form Solutions

When studying solutions to sequential decision problems, an important property is subtree perfectness (also called separability and consistency). This states that, roughly, for any subtree of the decision tree, the solution of the subtree equals the subtree of the solution. Commonly, solutions lacking subtree perfectness have the following behaviour: the subject initially wants to choose $X$ if he were to reach node $N$, but upon reaching $N$ wants to choose $Y$. This is a significant conflict.
Subtree perfectness is, however, a very restrictive property, requiring adherence to a preference ordering and the independence principle. We have found that a weaker form of subtree perfectness, admitting many more possible uncertainty and preference models, can be introduced. This essentially involves relaxing the ordering requirement while maintaining the independence principle. In this talk I will explain why this weakening may be acceptable, and make links with backward induction.
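
Backward induction is easiest to see in the classical case of maximising expected utility, where preferences form a total ordering and subtree perfectness holds. The sketch below covers only that case, not the weaker models discussed in the abstract; the tree encoding (nested tuples with numeric leaves) and the example tree are assumptions made for illustration.

```python
def backward_induction(node):
    """Return (expected utility, optimal plan) for a decision tree, working from the leaves."""
    if isinstance(node, (int, float)):              # leaf: terminal utility
        return float(node), None
    kind, children = node
    if kind == "chance":
        # children: list of (probability, subtree); average the solved subtrees
        solved = [(p, backward_induction(child)) for p, child in children]
        value = sum(p * v for p, (v, _) in solved)
        return value, [plan for _, (_, plan) in solved]
    if kind == "decision":
        # children: dict mapping action name to subtree; keep the best action
        solved = {a: backward_induction(child) for a, child in children.items()}
        best = max(solved, key=lambda a: solved[a][0])
        return solved[best][0], (best, solved[best][1])
    raise ValueError(f"unknown node type: {kind}")

# Hypothetical tree: a safe payoff of 4, or a gamble that may lead to a second decision.
later = ("decision", {"stop": 2, "go": ("chance", [(0.5, 20), (0.5, 0)])})
tree = ("decision", {"safe": 4, "risky": ("chance", [(0.5, later), (0.5, 0)])})
value, plan = backward_induction(tree)
```

Because the plan for the inner decision node is computed without reference to the rest of the tree, the choice made there is the same whether the node is solved in isolation or as part of the whole problem.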

Start time 12:20

ON THE CONVERGENCE OF CONTINUOUSLY MONITORED BARRIER OPTIONS UNDER MARKOV PROCESSES

Rui Xin Lee and Dr. Vassili Kolokoltsov

University of Warwick, UK

Keywords: Barrier options, Markov chains, Feller process, exit probabilities for continuous time Markov chains, infinitesimal generator

We consider a general barrier option for which the expected discounted random cash flow is modelled as

$$g(S_T)\, I_{\{\tau_A > T\}} + h(S_{\tau_A})\, I_{\{\tau_A \le T\}}$$

where $S_t$, $t \ge 0$, is a random price process, $I_C$ denotes the indicator of the set $C$, $\tau_A = \inf\{t \ge 0 : S_t \in A\}$, $g$ denotes a non-negative payoff, $h$ denotes the rebate function and $A$ denotes the knock-out range.
For barrier option prices under a given Feller price process $(S_t)_{t \ge 0}$ equipped with corresponding generator $L$, Mijatović and Pistorius (2009) present a novel approximation algorithm by constructing a finite-state continuous-time Markov chain $X^{(n)}$ whose generator is close to $L$, whose law is close to that of $(S_t)_{t \ge 0}$ and whose expected payoffs approximate those under $S = \{S_t\}_{t \ge 0}$.
We build on the work in Mijatović and Pistorius (2009). We study the convergence of such a sequence of finite-state continuous-time Markov chains to $S = \{S_t\}_{t \ge 0}$ and establish its rates of convergence.

Start time 12:45

DISTORTION OF PROBABILITY MODELS

Eva Wagnerova

University of West Bohemia in Pilsen, Czech Republic

Keywords: distortion functions, choice of a model, correction

The choice of a suitable model and its description with a probability distribution is the beginning of every statistical inference. However, the data do not always follow typical (textbook) probability distributions. One possible solution to that problem is to use a distortion function to correct the model.
The distortion function is a non-decreasing mapping of the interval [0, 1] into itself. It is a tool to transform distribution functions. This means it can also be used right at the beginning of the modelling. Some useful modifications of goodness-of-fit tests can be constructed through distortions.
In our presentation, we demonstrate some well-known distortion functions and their usage. We also show examples of suitable distortion functions for a given choice of model.
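
A minimal sketch of how a distortion acts on a distribution function is given below. It assumes the power distortion g(u) = u^a and an exponential base model, both chosen only for illustration, and additionally assumes g(0) = 0 and g(1) = 1 so that the composition is again a distribution function.

```python
import math

def power_distortion(u, a):
    """g(u) = u**a: non-decreasing on [0, 1] with g(0) = 0 and g(1) = 1 for a > 0."""
    return u ** a

def exponential_cdf(x, rate=1.0):
    """Distribution function of the exponential base model."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def distorted_cdf(cdf, distortion):
    """Compose a distortion with a distribution function to obtain a new one."""
    return lambda x: distortion(cdf(x))

# With a = 2 the distorted model is the distribution of the larger of two
# independent exponential variables, since P(max <= x) = F(x)**2.
G = distorted_cdf(exponential_cdf, lambda u: power_distortion(u, 2.0))
median_of_base = math.log(2.0)       # F(log 2) = 0.5, so G(log 2) = 0.25
```

Other choices of g shift probability mass in different ways, so the same base model can be bent towards heavier or lighter tails without changing its support.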

12.2.9 Session 6a: Sponsors’ Talks

Session Room: MS.01
Chair: Jennifer Rogers

Start time 14:30

THE INTERNATIONAL BIOMETRIC SOCIETY: WHAT CAN IT OFFER TO POSTGRADUATE STUDENTS?

Richard Emsley

International Biometric Society

This talk will introduce the International Biometric Society, which promotes the development and application of statistical and mathematical theory and methods in the biosciences. We discuss how the Society was founded by eminent statisticians of the day, and how it has now evolved into a truly international society. We focus on the opportunities available to postgraduate students within the International Biometric Society, including the FREE student membership, the activities of the British and Irish Region, and details of the 2010 International Biometric Conference taking place in Brazil in December this year.

Start time 15:05

BAYESIAN DESIGN & ANALYSIS OF EXPERIMENTS

Phil Woodward

Pfizer

Bayesian approaches are becoming widely used in the Pharmaceutical Industry, particularly in the earlier stages of drug discovery and development. This talk will present current uses of these methods at Pfizer. It will show how the objectives of the studies are quantified using the Bayesian probability concept, and how prior knowledge concerning the efficacy of the compounds being tested is formally used to assess the operating characteristics of the study design. It will also illustrate how more efficient studies have been designed by incorporating the formal use of such prior knowledge in the analysis.

Start time 15:40

AN INTRODUCTION TO FOOTBALL MODELLING AT SMARTODDS

Robert Mastrodomenico

SmartOdds

Sports modelling presents modern statistics with many interesting and complex problems. As well as the challenge of building models with high predictive utility, there is also a computational challenge associated with calibrating the models given the vast data sets now available across a wide range of sports. This talk describes some of the work we do at Smartodds by providing an introduction to football modelling, and all the associated problems and challenges. We begin by introducing some of the earlier published work in this area, in particular focusing on using generalised linear models to model the goals scored by each team in a football match. We discuss a range of modelling challenges that typically arise in the field of sports modelling, such as how to take account of home field advantage, how to allow for the different strengths of teams, and how to describe the variable nature of team strengths over time. Following this we discuss what it is like to work for Smartodds, and mention some other sports which we are actively researching.
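
As a hedged illustration of the kind of generalised linear model mentioned above, the sketch below treats the goals of each team as independent Poisson counts with log-linear rates built from attack strength, defence strength and a home advantage term. It is a textbook-style simplification and not Smartodds' model; every parameter name and value is hypothetical.

```python
import math

def poisson_pmf(k, mu):
    """Probability of exactly k goals when the expected number of goals is mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

def match_probabilities(attack_home, defence_home, attack_away, defence_away,
                        home_advantage=0.3, max_goals=15):
    """Home win / draw / away win probabilities under independent Poisson goal counts."""
    mu_home = math.exp(attack_home - defence_away + home_advantage)  # log-linear rates
    mu_away = math.exp(attack_away - defence_home)
    home = draw = away = 0.0
    for i in range(max_goals + 1):            # i = home goals, j = away goals
        for j in range(max_goals + 1):
            p = poisson_pmf(i, mu_home) * poisson_pmf(j, mu_away)
            if i > j:
                home += p
            elif i == j:
                draw += p
            else:
                away += p
    return home, draw, away

# Two equally strong teams: the only asymmetry is the home advantage term.
p_home, p_draw, p_away = match_probabilities(0.0, 0.0, 0.0, 0.0)
```

In practice the strength parameters would be estimated from past results, for example by Poisson regression, and allowed to vary over time.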

12.2.10 Session 6b: Sponsors’ Talks

Session Room: MS.04
Chair: Mouna Akacha

Start time 14:30

MAKING DECISIONS WITH CONFIDENCE - STATISTICS THE SHELL WAY

Wayne Jones

Shell

Shell’s Statistics and Chemometrics group provides research and consultancy services in data analysis and visualisation, statistical modelling, experimental design and statistical software tool development to many Shell businesses in the fields of commerce, finance, process development and product development. The group, based at Amsterdam, Chester and Houston, serves clients world-wide.

Start time 15:05

AN INTRODUCTION TO AHL

Martin Layton

AHL, Man Group PLC

In this presentation I will talk about AHL, a quantitative hedge fund with a 20 year track record of profitably trading financial markets using model-based, systematic approaches. After introducing AHL, I will walk through the process of creating and evaluating a simple trading system. Finally, time permitting, I will talk through some of the current areas of research within our group.

12.2.11 Session 6c: Sponsors’ Talks

Session Room: MS.05
Chair: Flavio B Goncalves

Start time 15:05

SUPPORT FROM THE RSS AND THEIR YOUNG STATISTICIANS SECTION

Helen Thornewell

Young Statisticians Section

The Royal Statistical Society is the professional body for statistics and statisticians in the UK. The presentation will remind you about the different memberships available as well as courses and qualifications on offer to support YOU. In particular, information will be presented about the RSS Young Statisticians Section, including its aims & objectives, a summary of successes since its official launch at the start of 2009, ways to get involved and adverts for upcoming events. Come and find out more about YOUR section.

Start time 15:40

OPPORTUNITIES IN PROBABILITY AND STATISTICAL MODELLING AT LLOYDS BANKING GROUP DECISION SCIENCE

Bill Fite

Lloyds Banking Group

‘There are no problems, only opportunities’ - Jacques Benacin

13 Poster Abstracts by Author

NONPARAMETRIC PREDICTIVE INFERENCE FOR SYSTEM FAILURE TIME

Abdullah Al-Nefaiee

Durham University, UK

Keywords: Lower and upper probabilities, Nonparametric predictive inference, System reliability

Nonparametric predictive inference (NPI) is a recently developed statistical framework which makes few modelling assumptions and uses lower and upper probabilities to quantify uncertainty. Throughout, we consider the use of NPI to predict reliability of systems, given failure times of tested components which are exchangeable with components used in the system considered. We present some main ideas, and these ideas are illustrated and discussed via examples. We also include a brief outline of main research challenges.

A COMPARISON OF BAYESIAN SPACE-TIME MODELS FOR OZONE CONCENTRATION LEVELS

Khandoker Shuvo Bakar

School of Mathematics, University of Southampton

Keywords: Space-time modelling, ozone concentrations, auto-regressive model, dynamic linear model, Bayesian spatial prediction

Recently, there has been a surge of interest in space-time modelling of ozone concentration levels. Well known time series modelling methods such as the dynamic linear models (DLM) and the auto-regressive (AR) models are being used together with the Bayesian spatial prediction (BSP) methods adapted for dynamic data. As a result, the practitioners in this field often face a daunting task of selection among these methods. This paper presents a study comparing three approaches: the DLM approach of Huerta et al. (2004), the BSP method as described by Le and Zidek (2006), and the AR models proposed by Sahu et al. (2007). Recent theoretical results (Dou et al., 2009) comparing the first two approaches are extended to include the AR models. The results are illustrated with a realistic numerical simulation example using information regarding the location of the ozone monitoring sites and observed ozone concentration levels in the state of New York in 2005-2006 for the months of June and July. The speed of computation, the availability of high-level software packages for implementing the methods, and the practical difficulties of using the methods for large space-time data sets are also investigated.

BIAS IN MENDELIAN RANDOMIZATION FROM WEAK INSTRUMENTS

Stephen Burgess and Simon G. Thompson

MRC Biostatistics Unit, University of Cambridge

Keywords: Genetic epidemiology, Mendelian randomization, Causality, Weak instruments, Finite sample bias

A common epidemiological question of interest is whether an observed correlation between a risk factor and a disease is a true causal association. Mendelian randomization is a technique for determining the causal association between a risk factor and an outcome in the presence of several possibly unmeasured confounders. A genetic variant is sought, by means of which, under certain assumptions, a causal association can be estimated. However, even when the necessary underlying assumptions are valid, estimates from analyses using genetic variants which are not strongly associated with the risk factor are biased. This bias, which acts in the direction of the observational association between risk factor and disease, may, if not correctly acknowledged, convince a researcher that an observed association is causal when in fact there is no true association.

USING DYNAMIC STAGED TREES FOR DISCRETE TIME SERIES DATA: ROBUST PREDICTION, MODEL SELECTION AND CAUSAL ANALYSIS

Guy Freeman and Jim Q. Smith

University of Warwick, Coventry, UK

Keywords: Staged trees, Bayesian model selection, Bayes factors, forecasting, discrete time series, causal inference, power steady model, multi-process model

The class of chain event graph models is a generalisation of the class of discrete Bayesian Networks, retaining most of the structural advantages of the Bayesian Network for model interrogation, propagation and learning, while more naturally encoding asymmetric state spaces and the order in which events happen. We demonstrate here how with complete sampling, conjugate closed form model selection based on product Dirichlet priors is possible for this class of models. We demonstrate our techniques using a simple educational example, and go on to discuss possible future enhancements to and applications of this model class.
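
The closed form that makes conjugate model selection possible is the Dirichlet-multinomial marginal likelihood. The sketch below computes it and uses it to compare "two situations share one transition distribution" against "they have separate distributions". It is an illustration of the conjugacy only; the function names are hypothetical, and the rule that prior parameters add when two situations are merged is an assumption of this example.

```python
from math import lgamma

def log_marginal(counts, alpha):
    """Log marginal likelihood of observed category counts under a Dirichlet(alpha) prior."""
    n, a = sum(counts), sum(alpha)
    out = lgamma(a) - lgamma(a + n)
    for c, al in zip(counts, alpha):
        out += lgamma(al + c) - lgamma(al)
    return out

def log_bayes_factor_merge(counts_1, counts_2, alpha):
    """Log Bayes factor for one shared distribution versus two separate ones."""
    merged = [c1 + c2 for c1, c2 in zip(counts_1, counts_2)]
    merged_alpha = [2 * a for a in alpha]     # assumption: priors add when stages merge
    separate = log_marginal(counts_1, alpha) + log_marginal(counts_2, alpha)
    return log_marginal(merged, merged_alpha) - separate

similar = log_bayes_factor_merge([5, 5], [5, 5], alpha=[1.0, 1.0])
different = log_bayes_factor_merge([9, 1], [1, 9], alpha=[1.0, 1.0])
```

A positive log Bayes factor favours merging the two situations, so `similar` supports a shared distribution while `different` supports keeping them apart. Because the score of a whole model is a sum of such terms, a search over models only needs to recompute the terms that change.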

FINDING CHANGEPOINTS IN A GULF OF MEXICO HURRICANE HINDCAST DATASET

Rebecca Killick (1), Idris Eckley (1), Kevin Ewans (2) and Philip Jonathan (3)

(1) Maths & Stats, Lancaster University
(2) Shell International Exploration & Production, Netherlands
(3) Shell Technology Centre Thornton, Chester

Keywords: Changepoints, Likelihood, Schwarz Information Criterion, Bayesian Information Criterion, GOMOS

Statistical changepoint analysis is used to detect changes in variability within GOMOS hindcast time-series for significant wave heights of storm peak events across the Gulf of Mexico for the period 1900-2005. To detect a change in variance, the two-step procedure consists of (1) validating model assumptions per geographic location, followed by (2) application of a penalised likelihood changepoint algorithm. Results suggest that the most important changes in time-series variance occur in 1916 and 1933 at small clusters of boundary locations at which, in general, the variance reduces. No post-war changepoints are detected. The changepoint procedure is readily applied to other environmental time-series.

ON THE CONVERGENCE OF CONTINUOUSLY MONITORED BARRIER OPTIONS UNDER MARKOV PROCESSES

Rui Xin Lee and Dr. Vassili Kolokoltsov

University of Warwick, UK

Keywords: Barrier options, Markov chains, Feller process, exit probabilities for continuous time Markov chains, infinitesimal generator

We consider a general barrier option whose expected discounted random cash flow is modelled as

$$g(S_T)\, I_{\{\tau_A > T\}} + h(S_{\tau_A})\, I_{\{\tau_A \le T\}},$$

where $S_t$, $t \ge 0$, is a random price process, $I_C$ denotes the indicator of the set $C$, $\tau_A = \inf\{t \ge 0 : S_t \in A\}$, $g$ denotes a non-negative payoff, $h$ denotes the rebate function, and $A$ denotes the knock-out range.

For barrier option prices under a given Feller price process $(S_t)_{t \ge 0}$ with corresponding generator $L$, Mijatovic and Pistorius (2009) present a novel approximation algorithm that constructs finite-state continuous-time Markov chains $(X^{(n)})$ such that the generator of $X^{(n)}$ is close to $L$, its law is close to that of $(S_t)_{t \ge 0}$, and its expected payoffs approximate those under $S = \{S_t\}_{t \ge 0}$.

We build on the work of Mijatovic and Pistorius (2009): we study the convergence of such a sequence of finite-state continuous-time Markov chains to $S = \{S_t\}_{t \ge 0}$ and establish its rate of convergence.
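To make the idea of pricing on a finite-state chain concrete, here is a minimal sketch under strong simplifying assumptions: the chain is a birth-death chain on a fixed grid, the jump rates are supplied by the user rather than derived from a generator, knocked-out states are made absorbing, and discounting is omitted. It is not the construction of the cited paper; all names and numbers are hypothetical.

```python
import math

def ctmc_barrier_value(grid, rates_up, rates_down, knocked, g, h, T, terms=200):
    """E[g(X_T) 1{tau > T} + h(X_tau) 1{tau <= T}] for every starting state of a
    birth-death chain on `grid`, where tau is the first time the chain sits in a
    knocked-out state. Assumes (max total rate) * T is moderate, so that `terms`
    Poisson terms are enough."""
    n = len(grid)
    # Knocked-out states are absorbing and carry the rebate as terminal value.
    f = [h(grid[i]) if knocked[i] else g(grid[i]) for i in range(n)]
    up = [0.0 if knocked[i] or i == n - 1 else rates_up[i] for i in range(n)]
    dn = [0.0 if knocked[i] or i == 0 else rates_down[i] for i in range(n)]
    lam = max(u + d for u, d in zip(up, dn)) or 1.0   # uniformisation rate
    # exp(T Q) f = sum_k Poisson(k; lam T) P^k f, with P = I + Q / lam.
    weight = math.exp(-lam * T)
    value = [weight * fi for fi in f]
    vec = f[:]
    for k in range(1, terms):
        vec = [vec[i]
               + (up[i] * (vec[i + 1] - vec[i]) if up[i] else 0.0) / lam
               + (dn[i] * (vec[i - 1] - vec[i]) if dn[i] else 0.0) / lam
               for i in range(n)]
        weight *= lam * T / k
        value = [v + weight * x for v, x in zip(value, vec)]
    return value

# Hypothetical up-and-out call: strike 100, knock-out range [120, inf), no rebate.
grid = [80 + 2 * i for i in range(21)]
knocked = [s >= 120 for s in grid]
rates = [1.0] * len(grid)
values = ctmc_barrier_value(grid, rates, rates, knocked,
                            g=lambda s: max(s - 100, 0.0), h=lambda s: 0.0, T=1.0)
```

Refining the grid and choosing the rates to match a target generator is where the convergence questions studied in the abstract arise.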

MULTI-ARMED BANDIT WITH REGRESSOR PROBLEMS

Benedict May and Dr. David Leslie

University of Bristol, UK

Keywords: Bandit Problem, Reinforcement Learning, Linear Regression, Nonparametric Regression

The multi-armed bandit problem is a simple example of the exploitation/exploration trade-off generally inherent in reinforcement learning problems. An agent is tasked with learning from experience how to sequentially make decisions in order to maximise average reward. In the extension considered, the agent is presented with a regressor before making each decision. The agent has to balance the tendency to explore apparently sub-optimal actions (in order to improve regression function estimates) against the tendency to exploit the current estimates (in order to maximise reward). Study of several past approaches to similar problems has indicated particular desirable properties for the policy used. These properties motivate the choice and study of the algorithm that features in this work. The theoretical properties of the algorithm have been studied and it has been tested on both linear and nonparametric regression problems. The intuitive algorithm has useful convergence properties and, compared to many conventional methods, performs well in simulations.
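The explore/exploit balance with a regressor can be sketched as follows. The abstract does not describe its algorithm, so this sketch substitutes a plain epsilon-greedy rule with one simple linear regression per arm; the class name, the reward curves and the exploration schedule are all hypothetical.

```python
import random

class ArmRegression:
    """Running least-squares fit of reward = a + b * x for one arm."""

    def __init__(self):
        self.n = self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, x, y):
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def predict(self, x):
        if self.n < 2:
            return 0.0
        den = self.n * self.sxx - self.sx ** 2
        if den <= 1e-12:                 # regressor values (almost) identical so far
            return self.sy / self.n
        b = (self.n * self.sxy - self.sx * self.sy) / den
        a = (self.sy - b * self.sx) / self.n
        return a + b * x

def choose_arm(arms, x, epsilon, rng):
    """Explore uniformly with probability epsilon, otherwise exploit the estimates."""
    if rng.random() < epsilon:
        return rng.randrange(len(arms))
    preds = [arm.predict(x) for arm in arms]
    return preds.index(max(preds))

# Hypothetical two-armed problem with a scalar regressor in [0, 1].
rng = random.Random(0)
arms = [ArmRegression(), ArmRegression()]
truth = [lambda x: x, lambda x: 1.0 - x]
for t in range(1, 2001):
    x = rng.random()                                  # regressor seen before deciding
    k = choose_arm(arms, x, epsilon=min(1.0, 10.0 / t), rng=rng)
    arms[k].update(x, truth[k](x) + rng.gauss(0.0, 0.1))
```

The decaying exploration rate is one simple way to keep improving the regression estimates early on while exploiting them later.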

ADAPTIVE ANALYSIS AND DESIGN OF MULTIVARIATE NORMAL RESPONSE STUDY WITH APPLICATION IN FMRI STUDIES

Giorgos Minas, Dr. F. Rigat, Dr. J. Aston, Prof. N. Stallard and Dr. T. Nichols

Department of Statistics, University of Warwick

Keywords: Multivariate Normal Distribution, Power, prior/posterior distribution, Monte Carlo approximation

We propose a two-stage adaptive design for a study with multivariate normal response where an overall effect is far more important than local effects. A linear combination of the marginals of the second-stage response is the main endpoint. The weights of the linear combination are chosen using the pilot data of the first stage such that power is maximised. Power is defined as the expectation of the rejection probability for the z-test (or t-test) of the linear combination, where the expectation is taken over the posterior distribution of the mean (and variance if unknown) of the multivariate response. The analytic expression for the optimal weighting under an identifiability constraint is given. The power under the optimal weighting is approximated using Monte Carlo approximation and sample size requirements for the two stages are provided. Application in fMRI studies is explored.

BAYESIAN ANALYSIS IN MULTIVARIATE DATA

Rofizah Mohammad and Dr. Karen Young

University of Surrey, UK

Keywords: Model choice, Bayes factors, Influential observations

In this study, we consider Bayesian model selection in multivariate normal data using the well-known Bayes factor. The standard improper priors are used for the model parameters. The device of imaginary observations is used to determine the ratio of the unspecified constants in the Bayes factors. We discuss a few different models. The diagnostic $k_d$ is used to assess the influence of observations on model choice based on Bayes factors. The calculations are illustrated using simulated data and the Iris data set.

MINKOWSKI FUNCTIONAL IN IMAGE ANALYSIS

Noratiqah Mohd Ariff and Dr. Elke Thonnes

University of Warwick, UK

Keywords: Minkowski functional, Boolean model

Various lung diseases, such as emphysema or pulmonary fibrosis, lead to structural deformations in lung tissue. These become apparent as textural changes in high resolution CT scans of the lung. One natural set of descriptors that may be used to quantify textural changes is the so-called Minkowski functionals or intrinsic volumes from integral geometry. These are related to more commonly known measures of shape, curvature and connectivity. In this work, methods of computing the Minkowski functionals from digital images are discussed and their accuracy is tested via standard models in stochastic geometry where the mean Minkowski functionals are already known analytically.
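One simple way to compute the three planar Minkowski functionals from a binary image is sketched below. It treats every foreground pixel as a closed unit square and counts faces, edges and vertices of the union; this is an illustrative convention chosen here (it corresponds to 8-connectivity), not necessarily one of the methods compared in the talk, and the function name is hypothetical.

```python
def minkowski_functionals(image):
    """Area, boundary length and Euler characteristic of the union of closed
    unit squares centred on the foreground pixels of a 0/1 image."""
    faces = {(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v}
    vertices, edges = set(), set()
    for r, c in faces:
        vertices.update({(r, c), (r + 1, c), (r, c + 1), (r + 1, c + 1)})
        # "h" edges are the top/bottom sides, "v" edges the left/right sides.
        edges.update({("h", r, c), ("h", r + 1, c), ("v", r, c), ("v", r, c + 1)})
    area = len(faces)
    # An edge shared by two foreground pixels is interior; all others are boundary.
    perimeter = 2 * len(edges) - 4 * len(faces)
    euler = len(vertices) - len(edges) + len(faces)
    return area, perimeter, euler

# A 3x3 ring with a one-pixel hole: one component, one hole.
ring = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]
ring_area, ring_perimeter, ring_euler = minkowski_functionals(ring)
```

Boundary length measured along pixel edges overestimates the perimeter of smooth shapes, which is one reason the accuracy of such estimators needs to be checked against models with known mean values.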

EXACT DISTRIBUTIONS AND SEQUENTIAL MONTE CARLO FOR CHANGE POINTS

Christopher Nam, John Aston and Adam Johansen

Department of Statistics, University of Warwick, UK

Keywords: Change Point analysis, Hidden Markov Models, Finite Markov Chain Imbedding, Sequential Monte Carlo Samplers

Quantifying the uncertainty in the locations of change points is a topic of increasingly significant interest with various application areas including economics and genetics. This poster will review an existing methodology for calculating change point distributions using general finite state Hidden Markov Models (HMMs) for a sequence of data. A change point is defined to have occurred when a particular state has persisted for at least a specified number of consecutive time periods. This methodology generates exact distributions for the location of change points for particular parameter values using Finite Markov Chain Imbedding (FMCI). The use of FMCI extends the original posterior Markov chain to a new Markov chain, such that the progress of any particular run can also be recorded within the state space. This ultimately allows the probability distribution function to be characterised completely without requiring any asymptotic arguments or being influenced by sampling error.

However, as these parameter estimates are themselves subject to uncertainty, the methodology is extended to generate samples from the parameter distributions using Sequential Monte Carlo (SMC). This in turn allows a more complete characterisation of the distribution of change points to be computed. The extended methodology benefits from the use of exact conditional distributions within the SMC, and is thus computationally more efficient than other approaches where state estimates for each time point are required.
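The run-based definition of a change point can be illustrated on a known state sequence. The sketch below is one simple reading of that definition, in which a change point is the start of a run of at least k identical states that follows a different state; the function name and the example sequence are hypothetical, and the exact FMCI distributions over hidden states are not reproduced here.

```python
def change_points(states, k):
    """Return (start_time, state) for every run of length >= k that follows a
    different state. The initial run is not counted as a change."""
    points = []
    start = 0
    for t in range(1, len(states) + 1):
        if t == len(states) or states[t] != states[start]:
            if t - start >= k and start > 0:
                points.append((start, states[start]))
            start = t
    return points

# Hypothetical regime sequence: the single visit to state 1 at time 3 and the
# two-period stay in state 0 from time 4 are too short to count when k = 3.
regimes = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0]
found = change_points(regimes, k=3)
```

In the HMM setting the states are unobserved, so the quantity of interest is the probability of such a run starting at each time point rather than a single list of locations.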

SAMPLE SIZE RE-ESTIMATION IN CLINICAL TRIALS WITH MULTIPLE ENDPOINTS

Ives Ntambwe, Tim Friede and Nigel Stallard

Warwick Medical School, University of Warwick, UK

Keywords: Bonferroni, multiple endpoints, sample size re-estimation, familywise error rate

The choice of an appropriate sample size is a main concern in the design of any clinical trial. In the planning stage of a trial one is often quite uncertain about the sizes of parameters or assumptions needed for sample size calculations. The idea of this project is to explore the use of designs that allow checking of these assumptions and adjustment of the sample size if necessary.

Designs with sample size re-estimation, also called designs with internal pilot study (IPS), are conducted to look at assumptions regarding the nuisance parameters.

Multiple endpoints are not uncommon in clinical research. One example is the use of a test battery in schizophrenia. The analysis is complicated in the presence of multiple endpoints and special techniques are needed to control the Type I error rate because hypotheses are tested for various endpoints.

This project aims to bring together the concept of designs with sample size re-estimation and the methodology for dealing with multiple endpoints. This will provide an extension of the current methodology for single endpoint sample size re-estimation to the multiple outcomes setting.

Preliminary results have been based on the use of a Bonferroni correction and show that, despite misspecification of the nuisance parameters at the planning stage, the power is maintained when performing sample size re-estimation.
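A minimal sketch of how a Bonferroni adjustment and an interim variance estimate might enter the sample size calculation is given below. It assumes a two-arm trial, a two-sided z-test per endpoint, power required for each endpoint separately, and a rule that never reduces the planned sample size; these choices, the function names and the numbers are assumptions made for illustration and are not taken from the project.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(sigma, delta, alpha=0.05, power=0.9, endpoints=1):
    """Per-arm sample size for a two-sided two-sample z-test of a mean difference
    delta with standard deviation sigma, at Bonferroni level alpha / endpoints."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / (2 * endpoints))
    z_beta = z(power)
    return ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)

def reestimate(planned_n, interim_sds, deltas, **kwargs):
    """Recompute the per-arm size from interim standard deviation estimates,
    taking the most demanding endpoint and never going below the planned size."""
    m = len(deltas)
    needed = max(n_per_arm(s, d, endpoints=m, **kwargs)
                 for s, d in zip(interim_sds, deltas))
    return max(planned_n, needed)

# Hypothetical trial with two endpoints planned with sd 1.0; the internal pilot
# suggests that the first endpoint is more variable than assumed.
planned = n_per_arm(1.0, 0.5, endpoints=2)
revised = reestimate(planned, interim_sds=[1.3, 1.0], deltas=[0.5, 0.5])
```

The nuisance parameter here is the standard deviation: only its interim estimate is used, so the treatment effect assumed at the planning stage is left unchanged.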

MODELLING AIR POLLUTION AND ITS RELATIONSHIP TO HEALTH

Oyebamiji Oluwole, Dr. Alison Gray and Prof. Chris Robertson

Mathematics and Statistics, University of Strathclyde, Glasgow, UK

Keywords: Air pollution, Spatial and temporal modelling, Time series

Atmospheric pollution is any substance capable of altering the natural composition of air and causing harm to both humans and their environment. The adverse effects of airborne pollutants upon human health have been well established. The aim of the current work is to model sulphur dioxide (SO2) levels in Scotland and to relate these to health. The method we are adopting is to concentrate on systematic trend surfaces which provide basic descriptions of the patterns of the data both spatially and temporally by incorporating these two attributes in a generalized additive regression model.

The study uses SO2 data from 41 stations monitoring air pollutants over Scotland, obtained from the UK Air Quality Archive data website (www.airquality.co.uk/data). The data used cover the years 1996-2007, comprising 3653 days in the entire study period. The data represent daily mean SO2 concentrations. We also have data on the geographical locations of the sites (Easting and Northing). There are missing data and not all stations have measurements for all the years. Descriptive analysis of the data has been carried out and the results of investigating ARMA and ARIMA models for time series modelling and imputation of missing data will be shown. Further modelling will involve use of generalized additive models to incorporate the data attributes of both space and time, before linking the modelled SO2 levels to variables concerning human health.

INFERENCE ABOUT THE RATIO OF TWO NORMAL MEANS FOR PAIRED OBSERVATIONS

Francisco J. Rubio

University of Warwick, UK

Keywords: Normal Ratio Distribution, Paired Observations, Reference Analysis

In order to make inferences about the ratio of Normal means β in the case of paired independent Normal random variables (X, Y), appropriate statistical models have been given in the statistical literature. However, in other scientific disciplines such as Cytometry, Physiology, and Medicine, the distribution of the corresponding ratio of the Normal variables, or a Normal approximation to this distribution, is used to estimate β. It has been reported (Merril (1928), Marsaglia (1965), Kuete et al. (2000)) that the distribution of the ratio of two independent Normal variables Z = X/Y could be bimodal, or asymmetric, or symmetric, or similar to a Normal distribution under some conditions on the parameters. These conditions have been established through simulations and empirical results. In the literature reviewed there is a lack of assessment of the error made with these procedures. The goal of the present work is to quantify and characterize this error in terms of certain conditions on the parameters of the Normal variables. In addition, a result about the existence of a Normal approximation to the distribution of Z when the means of X and Y are positive is presented. Finally, the reference posterior distribution of the ratio of two positive Normal means is analysed.

WAVELET METHODS FOR BRAIN IMAGING ANALYSIS

Yiqin Shen and Dr. J.A.D. Aston

Department of Statistics, University of Warwick

Keywords: Wavelet, Brain Imaging Analysis, fMRI, pre-whitening

The human brain can now be studied using neuroimaging techniques such as functional Magnetic Resonance Imaging (fMRI). fMRI data are four-dimensional: three spatial dimensions and one temporal dimension. When modelling the time dimension, the traditional way is to estimate linear model parameters using least squares model fitting. An alternative approach is proposed in the following steps. First, apply a wavelet transform to the data in the space dimension. Second, on the coarsest wavelet coefficients, estimate parameters using standard linear regression. These parameters can then be used to construct a prior to estimate the parameters in the rest of the hierarchical wavelet structure by Bayesian linear regression. A normal prior, scaled using the previous parameter estimates, is used, as the noise increases at higher resolutions and the Bayesian framework consequently bounds the estimates and smoothly shrinks the estimated parameters towards zero (helping remove noise). Third, apply the inverse wavelet transform to the estimated parameters and thus obtain the parameters for all locations in the original image space.

The errors are not independent in time, so it is necessary to pre-whiten the data and design matrices using the autocorrelation of each time series. However, when the wavelet method is used with a separate autocorrelation model for each time series, the design matrices at all spatial locations still need to be kept identical. Thus, we examine through simulation an approximation using a global value of autocorrelation for the design matrices and apply this to real data.

14 RSC 2011: Cambridge University

34th Research Students’ Conference in Probability and Statistics

Cambridge

4th - 7th April 2011

Centre for Mathematical Sciences

[email protected]

15 Sponsors’ Advertisements

This year the conference is being financially supported by CRiSM. CRiSM (Centre for Research in Statistical Methodology) is an EPSRC supported initiative to build capacity in Statistics within the UK. It lives within the Statistics Department at Warwick, and funds three academic positions, five postdoctoral research associates and many PhD students. In addition it organises many workshops and conferences and has an energetic visitor programme. Its director is Gareth Roberts. Further information about its activities can be found at http://www2.warwick.ac.uk/fac/sci/statistics/crism/

Forthcoming CRiSM Workshops:

Tue, Apr 20, '10
CRiSM Workshop: Continuous-time and continuous space processes in ecology
Runs from Tuesday, April 20 to Wednesday, April 21.

Sun, May 30, '10
CRiSM Workshop: Model Uncertainty
Runs from Sunday, May 30 to Tuesday, June 01.

Mon, Jul 12, '10
CRiSM Workshop: Orthogonal Polynomials and Application in Statistics and Stochastic Processes
Runs from Monday, July 12 to Thursday, July 15.

Mon, Mar 28, '11
CRiSM Workshop: InFer (Inference for Epidemic-related Risk)
Runs from Monday, March 28 to Friday, April 01.

ATASS Sports is a leading statistical research consultancy business providing clients with high-quality sports models and predictions.

We are currently looking to fill a range of full-time positions to work as part of our existing and planned research teams. We have both senior and junior positions available for applied statisticians, mathematical modellers, database managers and IT support staff. Generous salary and benefits packages are the norm and depend upon position, qualifications and experience. All posts will be based at our newly developed office complex on the Exeter business park. Our new recruits will work within close-knit multi-skilled teams to model a variety of sports, obtaining and incorporating real-time information, and developing and applying novel statistical and mathematical modelling techniques.

The closing date for this round of appointments is April 22nd 2010. Applications or requests for further information should be addressed to Steve Brooks via [email protected]. Additional information can also be found on our web site – www.atassltd.co.uk.

ATASS is committed to equality and values diversity. We welcome applications from all suitably qualified individuals - see web site for details.

Statistical Modelling Vacancies

www.atassltd.co.uk

Innovation in Sports Modelling

Work for AHL in Oxford

For more information visit www.ahl.com

Candidates must have a PhD or equivalent qualification in a quantitative discipline

Get the benefits of a City career whilst working alongside world class academics

Statistics

Want to extend your reach?

Then play a vital role in developing new life-saving, life-enhancing drugs for people worldwide. Join our Statistics group at Pfizer, and you’ll help one of the world’s largest pharmaceutical companies with the discovery, development and trial of new molecules and compounds that improve lives.

Statistics and Chemometrics

Making decisions with confidence

About us

The Statistics and Chemometrics team includes statisticians, data analysts, chemometricians and modellers who help clients in the commerce, finance, process and product development industries to develop better business solutions. The team draws on Shell Group experience of providing cutting-edge consultancy, software, innovation and training for more than 30 years to serve clients worldwide from bases in the UK, the Netherlands and the USA.

Website: http://www.shell.com/globalsolutions/statisticsandchemometrics

Email: [email protected]

Shell Global Solutions is a network of independent technology companies in the Shell Group. In this case study, the expressions ‘Shell’ and ‘Shell Global Solutions’ are sometimes used for convenience where reference is made to these companies in general, or where no useful purpose is served by identifying a particular company. The information contained in this material is intended to be general in nature and must not be relied on as specific advice in connection with any decisions you may make. Shell and Shell Global Solutions are not liable for any action you may take as a result of you relying on such material or for any loss or damage suffered by you as a result of you taking this action. Furthermore, these materials do not in any way constitute an offer to provide specific services. Some services may not be available in certain countries or political subdivisions thereof. Photographs are from various installations. Copyright © 2008 Shell Global Solutions International BV. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical including by photocopy, recording or information storage and retrieval system, without permission in writing from Shell Global Solutions International BV.

GS1427690308-En(A)

Process Solutions
- Statistical process modelling
- Process solutions and software for optimising performance and operating cost, e.g. pre-heat units
- Process software for assuring integrity of pipework
- Tools and techniques to emulate and optimise process conditions

Product Development
- Supporting product development in fuels, lubricants, chemicals: e.g. vehicle testing, emission testing
- Designing experiments to provide evidence to support marketing claims
- Analysing data to help understand effects
- Collaborating with product teams to provide ongoing support

Training and Software
- Customised statistical training courses - statistics and design of experiments training
- Customised training on use of specialist software tools

Chemometrics
- Process Analytical Chemistry using advanced spectroscopy with multivariate calibration models, e.g. for MOGAS blending
- Advanced Process Monitoring using multivariate statistical techniques, e.g. Dynamic Chemical Processes
- Enhanced Experimentation, e.g. catalyst characterisation Electron Microscopy, X-Ray Analysis, Kernels, Multivariate Analysis

Business Solutions
- Statistical forecasting
- Decision tools on Carbon management
- Risk and uncertainty modelling
- Benchmarking advanced data analysis

Sign up to our mailing list to win a Sony e-Reader

Carry 100s of electronic books on a slim, lightweight digital Sony e-book reader and access your essential Wiley collection wherever and whenever you want.

We’re offering a fantastic opportunity to win a Sony eBook reader*. For your chance to win simply sign up to our mailing list at www.wiley.com/go/win

Entrants will be entered into an additional monthly draw for the chance to win a £100 Wiley voucher to spend on Wiley books.

*Full terms and conditions can be found at www.wiley.com/go/win

WIN!

20% conference discount on selected Probability and Statistics titles

For a limited time only!

Coming this summer:

Bayesian Decision Analysis: Principles and Practice
Jim Q. Smith
Hardback 9780521764544 | GBP c. 35.00

www.cambridge.org

To view all the titles included in this offer please visit www.cambridge.org/RSC2010

www.rss.org.uk/rss2010

RSS 2010 International Conference

Brighton, 13–17 September

Topics include
- climate change modelling
- trust in official statistics
- 2011 census
- measuring progress
- adaptive clinical trials
- performance indicators
- composite likelihood
- MCMC
- data capture
- risk
- statistical literacy

A forum for presentation and discussion of methodological developments and areas of application for statisticians and users of statistics.

An opportunity for statisticians, analysts, researchers and other users of statistics from all sectors to share current research and insights.

Confirmed speakers
- Tim Davis (Jaguar Land Rover)
- Peter Donnelly (University of Oxford)
- Nancy Reid (University of Toronto)

Key dates
- 19 April: Extended deadline for RSC attendees
- 24 May: Deadline for grant and bursary applications
- 7 June: Deadline for ‘double discount’ registration
- 9 August: Deadline for discounted registration
- 8 September: Deadline for pre-event registration

16 RSC History

34 | 2011 | Cambridge
33 | 2010 | Warwick
32 | 2009 | Lancaster
31 | 2008 | Nottingham
30 | 2007 | Durham
29 | 2006 | Glasgow
28 | 2005 | Cambridge
27 | 2004 | Sheffield
26 | 2003 | Surrey
25 | 2002 | Warwick
24 | 2001 | Newcastle
23 | 2000 | Cardiff and University of Wales
22 | 1999 | Bristol
21 | 1998 | Lancaster
20 | 1997 | Glasgow
19 | 1996 | Southampton
18 | 1995 | Oxford
17 | 1994 | Reading
16 | 1993 | Lancaster
15 | 1992 | Nottingham
14 | 1991 | Newcastle
13 | 1990 | Bath
12 | 1989 | Glasgow
11 | 1988 | Surrey
10 | 1987 | Sheffield
(9) | |
(8) | 1985 | Oxford
(7) | 1984 | Imperial College
(6) | |
(5) | |
(4) | |
(3) | 1982 | Cambridge
(2) | 1981 | Bath
(1) | 1980 | Cambridge

Table 1: RSC History

17D

eleg

ate

List

Nam

eIn

stit

utio

nYe

arEm

ail

Res

earc

hIn

tere

sts

Ahm

adM

oham

mad

Abo

alkh

air

Dur

ham

Uni

vers

ity

2a.

m.a

boal

khai

r@du

rham

.ac.

ukN

onpa

ram

etri

cPr

edic

tive

Infe

renc

e(N

PI)-

Syst

emR

elia

bilit

y

Hol

lyA

insw

orth

New

cast

leU

nive

rsit

y1

h.f.a

insw

orth

@nc

l.ac.

ukBa

yesi

anM

odel

ling

for

Ecol

ogy

Mou

naA

kach

aU

nive

rsit

yof

War

wic

k3

M.A

kach

a@w

arw

ick.

ac.u

kLo

ngit

udin

alD

ata,

Mis

sing

Dat

a,N

on-L

inea

rM

ixed

Mod

els

Faiz

aA

liD

urha

mU

nive

rsit

y2

f.f.a

li@du

r.ac.

ukba

yes

linea

rst

atis

tics

Abd

ulla

hH

.Al-

nefa

iee

Dur

ham

Uni

vers

ity

2a.

h.al

-nef

aiee

@du

rham

.ac.

ukN

onpa

ram

etri

cpr

edic

tive

infe

renc

efo

rsy

stem

failu

reti

me

Muh

anna

dF.

K.A

l-sa

adon

yU

nive

rsit

yof

Plym

outh

1m

uhan

nad.

alsa

adon

y@pl

ymou

th.a

c.uk

Stoc

hast

icIn

tegr

alan

dA

pplic

atio

nto

Fina

nce

Osv

aldo

Ana

clet

o-Ju

nior

The

Ope

nU

nive

rsit

y1

o.an

acle

to-ju

nior

@op

en.a

c.uk

Tim

eSe

ries

,Bay

esia

nFo

reca

stin

g,Tr

affic

Mod

ellin

g

Isad

ora

Ant

onia

no-V

illal

obos

Uni

vers

ity

ofK

ent

2ia

57@

kent

.ac.

ukBa

yesi

anin

fere

nce

for

Mar

kov

cont

inuo

usre

al-v

alue

dpr

oces

ses

116

Page 117: 33rd Research Students' Conference in Probability and Statistics

Nam

eIn

stit

utio

nYe

arEm

ail

Res

earc

hIn

tere

sts

Nor

atiq

ahM

ohd

Ari

ffU

nive

rsit

yof

War

wic

k1

strj

ab@

war

wic

k.ac

.uk

Stat

isti

calI

mag

eA

naly

sis

Loui

sJM

Asl

ett

Trin

ity

Col

lege

Dub

lin2

loui

s@m

aths

.tcd.

ieBa

yesi

anin

fere

nce

and

relia

bilit

yth

eory

Nur

iBad

iN

ewca

stle

Uni

vers

ity

1n.

h.ba

di@

new

cast

le.a

c.uk

Gen

eral

ized

linea

rm

odel

s

Kha

ndok

erSh

uvo

Baka

rU

nive

rsit

yof

Sout

ham

pton

2ks

b2g0

8@so

ton.

ac.u

kEn

viro

nmen

tal

Mod

ellin

g,Ba

yesi

anA

naly

sis,

Spat

io-t

empo

ral

Mod

-el

ling.

Ant

onio

Arm

ando

Ort

izBa

rran

onU

nive

rsit

yof

Ken

t2

aao3

3@ke

nt.a

c.uk

Extr

eme

Val

ueTh

eory

Paul

Barr

yTr

init

yC

olle

geD

ublin

1ba

rryp

b@tc

d.ie

Baye

sian

Infe

renc

e

Ale

xBe

rrim

anU

nive

rsit

yof

Live

rpoo

l2

adcb

@liv

erpo

ol.a

c.uk

Epid

emio

logy

Arn

abBh

atta

char

yaTr

init

yC

olle

geD

ublin

2bh

atta

ca@

tcd.

ieBa

yesi

anin

fere

nce

and

sequ

enti

alm

etho

ds

Saky

ajit

Bhat

tach

arya

Uni

vers

ity

Col

lege

Dub

lin2

saky

ajit

.bha

ttac

hary

a@gm

ail.c

omLi

near

mod

els,

dele

tion

diag

nost

ics

117

Page 118: 33rd Research Students' Conference in Probability and Statistics

Nam

eIn

stit

utio

nYe

arEm

ail

Res

earc

hIn

tere

sts

Ms

Suju

nya

Boon

prad

itU

nive

rsit

yof

Shef

field

1st

p09s

b@sh

effie

ld.a

c.uk

Baye

sian

stat

isti

cs

Phili

ppa

Burd

ett

Uni

vers

ity

ofLe

eds

1m

m08

pmb@

leed

s.ac

.uk

Bioi

nfor

mat

ics

Step

hen

Burg

ess

MR

CC

ambr

idge

2st

ephe

n.bu

rges

s@m

rc-b

su.c

am.a

c.uk

Men

delia

nra

ndom

izat

ion,

Cau

sal

infe

renc

e,Li

tera

ryw

orks

ofL.

N.

Tols

toy

Sim

onBy

rne

Stat

isti

calL

abor

ator

y2

s.by

rne@

stat

slab

.cam

.ac.

ukG

raph

ical

mod

els,

Baye

sian

stat

isti

cs

Alb

erto

Cai

mo

Uni

vers

ity

Col

lege

Dub

lin2

albe

rto.

caim

o@uc

d.ie

Stat

isti

caln

etw

ork

anal

ysis

,MC

MC

met

hods

,Bay

esia

nst

atis

tics

Joe

Cai

ney

Uni

vers

ity

ofBr

isto

l1

joe.

cain

ey@

bris

tol.a

c.uk

Mon

teC

arlo

,MC

MC

,Ada

ptiv

eM

C,S

eque

ntia

lMC

Jona

than

Cai

rns

Uni

vers

ity

ofC

ambr

idge

,Dep

tofO

ncol

ogy

1jm

c200

@ca

m.a

c.uk

Com

puta

tion

alBi

olog

y,M

arko

vC

hain

Mon

teC

arlo

Soha

ilC

hand

Uni

vers

ity

ofN

otti

ngha

m3

pmxs

c1@

nott

ingh

am.a

c.uk

Tim

ese

ries

,Reg

ress

ion

anal

ysis

,Boo

tstr

apm

etho

s

Thom

asD

essa

inD

urha

mU

nive

rsit

y1

t.j.d

essa

in@

durh

am.a

c.uk

App

lied

Prob

abili

ty,d

iscr

ete

and

cont

inuo

usti

me

mar

kov

proc

esse

s

118

Page 119: 33rd Research Students' Conference in Probability and Statistics

Nam

eIn

stit

utio

nYe

arEm

ail

Res

earc

hIn

tere

sts

Car

aD

oole

yN

UI,

Gal

way

1ca

rado

oley

@gm

ail.c

omSu

rviv

alA

naly

sis,

Frai

lity

Mod

els

Susa

nD

oshi

Uni

vers

ity

ofBa

th2

s.k.

dosh

i@ba

th.a

c.uk

Imag

ean

alys

is,c

one-

beam

CT,

imag

e-gu

ided

radi

othe

rapy

Fadl

alla

Elfa

daly

The

Ope

nU

nive

rsit

y2

f.elf

adal

y@op

en.a

c.uk

Baye

sian

Stat

isti

cs,S

ubje

ctiv

ePr

ior

Elic

itat

ion

Elen

iElia

Uni

vers

ity

ofN

otti

ngha

m1

elen

aelia

3@ho

tmai

l.com

Mod

ellin

gho

spit

alsu

perb

ugs

His

ham

Abd

elH

amid

Elsa

yed

Uni

vers

ity

ofSo

utha

mpt

on3

hish

asta

t@ya

hoo.

com

Surv

ival

Ana

lysi

s

Mar

ina

Evan

gelo

uM

RC

Cam

brid

ge1

mar

ina.

evan

gelo

u@m

rc-b

su.c

am.a

c.uk

Stat

isti

calg

enet

ics

and

Bioi

nfor

mat

ics,

Gen

ome

wid

eas

soci

atio

nan

al-

ysis

and

Path

way

Gen

ome

wid

ean

alys

is

Felic

ity

Kim

Evis

onU

nive

rsit

yof

Dur

ham

1fe

licit

y.ev

ison

@du

rham

.ac.

ukPu

blic

Hea

lth

Stat

isti

cs,M

edic

al,S

tati

stic

alG

enet

ics

Sean

Ewin

gsU

nive

rsit

yof

Sout

ham

pton

2sm

e1v0

7@so

ton.

ac.u

kD

iabe

tes

Chr

isto

pher

Falla

ize

Uni

vers

ity

ofLe

eds

3ch

risf

@m

aths

.leed

s.ac

.uk

Stat

isti

cals

hape

anal

ysis

,str

uctu

ralb

ioin

form

atic

s

119

Page 120: 33rd Research Students' Conference in Probability and Statistics

Nam

eIn

stit

utio

nYe

arEm

ail

Res

earc

hIn

tere

sts

Ais

haFa

yom

iU

nive

rsit

yof

Not

ting

ham

3La

vend

ers-

love

r@ho

tmai

l.co.

ukM

ulti

vari

ate

Ana

lysi

s

Veri

tyFi

sher

Uni

vers

ity

ofSo

utha

mpt

on1

vaf1

g09@

soto

n.ac

.uk

Expe

rim

enta

lDes

ign

Ash

ley

Parr

yFo

rdU

nive

rsit

yof

War

wic

k2

a.p.

ford

@w

arw

ick.

ac.u

kM

CM

C,E

pide

mic

s

Ann

aFo

wle

rIm

peri

alC

olle

geLo

ndon

1a.

fow

ler0

9@im

peri

al.a

c.uk

Baye

sian

stat

isti

cs;s

tati

stic

alge

neti

cs

Guy

Free

man

Uni

vers

ity

ofW

arw

ick

4g.

free

man

@w

arw

ick.

ac.u

kBa

yesi

anst

atis

tics

,cau

salit

y,gr

aphi

calm

odel

s

Joth

amG

audo

inU

nive

rsit

yof

Sout

ham

pton

1J.P

.K.G

audo

in@

soto

n.ac

.uk

Baye

sian

mod

ellin

gfo

rbi

nary

data

Isab

ella

Gol

lini

Uni

vers

ity

Col

lege

Dub

lin2

isab

ella

.gol

lini@

ucd.

ieM

odel

base

dcl

uste

ring

,Mix

ture

mod

els,

Net

wor

km

odel

s

Flav

ioB

Gon

calv

esU

nive

rsit

yof

War

wic

k3

F.B.

Gon

calv

es@

war

wic

k.ac

.uk

Baye

sian

infe

renc

efo

rdi

ffus

ions

Aim

eeG

ott

Lanc

aste

rU

nive

rsit

y1

a.go

tt@

lanc

aste

r.ac.

ukW

avel

ets

120

Page 121: 33rd Research Students' Conference in Probability and Statistics

Name | Institution | Year | Email | Research Interests
Seungjin Han | University of Sheffield | 1 | s.han@shef.ac.uk | Bayesian, Time Series, Statistical Arbitrage
Siti Rahayu Mohd Hashim | The University of Sheffield | 2 | stp08sm@sheffield.ac.uk | Multivariate quality control
Siew Wan Hee | University of Warwick | 2 | s.w.hee@warwick.ac.uk | Adaptive Bayesian design, Phase II trial
Bryony Hill | University of Warwick | 3 | b.j.hill@warwick.ac.uk | Spatial Statistics, MCMCs, Stochastic Geometry.
Kirsty Hinchliff | Durham University | 1 | k.m.hinchliff@gmail.com | Info-Gap Decision Theory
Nathan Huntley | Durham University | 3 | nathan.huntley@durham.ac.uk | Foundations of Decision Theory, Imprecise Probability
Alberto Alvarez Iglesias | National University of Ireland, Galway | 2 | a.alvareziglesias1@nuigalway.ie | Regression Trees, Classification Trees and Survival Trees
Amin Jamalzadeh | Durham University | 3 | mohammadamin.jamalzadeh@dur.ac.uk | Data Mining Techniques; Bayesian Analysis; Data Visualization
Joao Jesus | University College London | 3 | joao@stats.ucl.ac.uk | Inference Without Likelihood

Page 122: 33rd Research Students' Conference in Probability and Statistics

Name | Institution | Year | Email | Research Interests
Emma Jones | University of Sheffield | 2 | stp08emj@shef.ac.uk | Dendrochronology
Chaitanya Joshi | Trinity College Dublin | 3 | joshic@tcd.ie | Bayesian Modelling, Bayesian Inference for Diffusion Processes
Oyebamiji Oluwole Kehinde | University of Strathclyde | 1 | wolemi2@yahoo.com | Spatio-temporal modelling of air pollution
Emma Kershaw | University of Bristol | 1 | emma.kershaw.08@bristol.ac.uk | Applied Probability, Population Genetics, Statistical Genetics
Mudakkar Mnas Khadim | Queen Mary University of London | 2 | mk@maths.qmul.ac.uk | Design of experiments
Md. Hasinur Rahaman Khan | University of Warwick | 2 | m.h.rahaman-khan@warwick.ac.uk | Bayesian Statistics, Biostatistics, Social Statistics
Mahmuda Khatun | University of Strathclyde | 2 | m.khatun@strath.ac.uk | Statistics in Image Processing
Rebecca Killick | Lancaster University | 2 | r.killick@lancs.ac.uk | Wavelets, Changepoints, Non-stationary Time Series
Jennifer Helen Klapper | University of Leeds | 2 | jennifer@maths.leeds.ac.uk | Wavelets, Vaguelette-Wavelets, HPLC

Page 123: 33rd Research Students' Conference in Probability and Statistics

Name | Institution | Year | Email | Research Interests
Maria Konstantinou | University of Southampton | 1 | mk21g09@soton.ac.uk | Experimental Design
Karolina Krzemieniewska | Lancaster University | 1 | k.krzemieniewska@lancs.ac.uk | Wavelets
Tomasz Lapinski | University of Warwick | 1 | T.M.Lapinski@warwick.ac.uk | Probability Theory
Michael Lawton | MRC Cambridge | 1 | michael.lawton@mrc-bsu.cam.ac.uk | Bayesian Hierarchical Modelling and Stochastic Processes
Min Cherng Lee | University of Southampton | 1 | mcl206@soton.ac.uk | Statistical Disclosure Control
Rui Xin Lee | University of Warwick | 1 | ruixin.lee@gmail.com | financial mathematics, probability theory, markov processes, credit derivatives
Ye Liu | University of Lancaster | 1 | y.liu10@lancaster.ac.uk | Extreme Value Theory
Stephanie Llewelyn | University of Sheffield | 1 | s.llewelyn@sheffield.ac.uk | Probability and Statistics
Dominic Magirr | Lancaster University | 1 | d.magirr@lancaster.ac.uk | Early-phase clinical trials, adaptive designs

Page 124: 33rd Research Students' Conference in Probability and Statistics

Name | Institution | Year | Email | Research Interests
Colette Mair | University of Glasgow | 2 | c.mair@stats.gla.ac.uk | Population genetics
Patrice Marek | University of West Bohemia | 4 | patrke@kma.zcu.cz | statistics
Kieran Martin | University of Southampton | 2 | kjm2v07@soton.ac.uk | Optimal design for non-linear models
Benedict Christian May | University of Bristol | 2 | bm2668@bris.ac.uk | reinforcement learning, multi-armed bandits, regression
Fiona McElduff | Institute of Child Health, UCL | 3 | f.mcelduff@ich.ucl.ac.uk | discrete distributions
Daniel Michelbrink | The University of Nottingham | 3 | pmxdm@nottingham.ac.uk | Mathematical Finance
Giorgos Minas | University of Warwick | 1 | G.C.Minasatwarwik.ac.uk | Adaptive analysis and design, fMRI
Erin Mitchell | University of Lancaster | 1 | e.mitchell@lancaster.ac.uk | Non-Stationary Time Series Analysis, Dynamic Linear Models
Joanne Louise Moffatt | The University of Salford | 2 | j.l.moffatt@pgr.salford.ac.uk | Modelling techniques used to investigate strategies teams/players can apply to increase their chances of winning in a sporting contest.

Page 125: 33rd Research Students' Conference in Probability and Statistics

Name | Institution | Year | Email | Research Interests
Nur Anisah Mohamed | Newcastle University | 1 | n.a.mohamed@newcastle.ac.uk | Optimal Dynamic Treatment Regimes
Rofizah Mohammad | University of Surrey | 2 | r.mohammad@surrey.ac.uk | Bayesian Modelling in multivariate data
Christopher Nam | University of Warwick | 1 | c.f.h.nam@warwick.ac.uk | Hidden Markov Models
Stuart Nicholls | Lancaster University | 3 | s.nicholls@lancaster.ac.uk | Latent variable models, bioethics, decision-making, attitude measurement
Mitra Noosha | Queen Mary University of London | 2 | mnoosha@hotmail.com | Discordancy between prior and data in Bayesian Inference
Beth Norris | University of Kent | 2 | bn40@kent.ac.uk | statistical ecology
Ives Ntambwe | Warwick Medical School | 2 | i.l.ntambwe@warwick.ac.uk | Medical Statistics
Emmanuel Olusegun Ogundimu | University of Warwick | 1 | E.O.Ogundimu@warwick.ac.uk | missing data, longitudinal studies and surrogate markers evaluation in clinical trials

Page 126: 33rd Research Students' Conference in Probability and Statistics

Name | Institution | Year | Email | Research Interests
Adrian O’Hagan | UCD Dublin | 3 | adrian.ohagan@hotmail.co.uk | Mixture Models, Generative Discriminative Hybrids, Extensions to the EM Algorithm
Aidan O’Keeffe | MRC Cambridge | 2 | aidan.o’keeffe@mrc-bsu.cam.ac.uk | Dynamic causal inference and multi-state modelling
Rachel Oxlade | Durham University | 2 | r.h.oxlade@durham.ac.uk | Bayesian statistics, Bayes linear, uncertainty analysis, computer simulators, emulation
Ioannis Papastathopoulos | Lancaster University | 1 | i.papastathopoulos@lancaster.ac.uk | Extreme Value Theory, Bayesian Statistics, Time Series
Christopher Pearce | University of Liverpool | 3 | Pearchrs9@aol.com | Epidemiology, Stochastic models
Duy Pham | University of Warwick | 2 | Duy.Pham@warwick.ac.uk | Probability, Financial Mathematics, Interest Rate Modelling
Murray Pollock | University of Warwick | 1 | murray.pollock@warwick.ac.uk | MCMC
Benedict Powell | Durham University | 1 | benedict.powell@durham.ac.uk | Bayesian emulation, multivariate spatial statistics
Helen Powell | University of Glasgow | 2 | h.powell@stats.gla.ac.uk | Modelling the effects of air pollution

Page 127: 33rd Research Students' Conference in Probability and Statistics

Name | Institution | Year | Email | Research Interests
Dennis Prangle | Lancaster University | 3 | d.prangle@lancaster.ac.uk | Bayesian Statistics, ABC, Infectious Disease Models
Iain Proctor | University of Glasgow | 2 | ipro@ceh.ac.uk | Statistics, Ecology
Noorazrin Abdul Rajak | Newcastle University | 2 | noorazrin.abdul-rajak@ncl.ac.uk | Bayesian Experimental Design
Clare Emily Raychaudhuri | Bristol University | 2 | clare.e.martin@gmail.com | Stochastic differential equations, numerical integration, chaotic systems
Shijie Ren | University of Sheffield | 3 | stp07sr@shef.ac.uk | Bayesian Clinical Trials
Jennifer Rogers | University of Warwick | 3 | J.K.Rogers@warwick.ac.uk | Survival Analysis, Recurrent Events
Verena Roloff | MRC Biostatistics Unit | 2 | verena.roloff@mrc-bsu.cam.ac.uk | Meta-analysis
Francisco Rubio | University of Warwick | 1 | F.J.Rubio@warwick.ac.uk | Bayesian Statistics, Biostatistics
Alastair Rushworth | University of Glasgow | 1 | alastair@stats.gla.ac.uk | Spatial and environmental modelling

Page 128: 33rd Research Students' Conference in Probability and Statistics

Name | Institution | Year | Email | Research Interests
Fiona Sammut | University of Warwick | 1 | f.sammut@warwick.ac.uk | Multivariate Analysis, GLMs
Ria Sanderson | Office for National Statistics | NA | ria.sanderson@ons.gov.uk | Sample design & estimation, modelling techniques
Susanne Schmitz | Trinity College Dublin | 1 | schmitzs@tcd.ie | Bayesian Inference
Javier Serradilla | Newcastle University | 3 | javier.serradilla@ncl.ac.uk | Gaussian Processes, Multivariate Statistical Process Control, Factor Analysis
Golnaz Shahtahmassebi | University of Plymouth | 2 | golnaz.shahtahmassebi@plymouth.ac.uk | Financial Statistics, Bayesian modelling, Computational Statistics
Yiqin Shen | University of Warwick | 1 | yiqin.shen@warwick.ac.uk | Brain imaging analysis
Andrew Simpkin | National University of Ireland, Galway | 3 | a.simpkin1@nuigalway.ie | Smoothing and Derivatives
Sawaporn Siripanthana | University of Sheffield | 1 | smp08ss@sheffield.ac.uk | Statistical process control
Andrew Smith | University of Bristol | 3 | Andrew.D.Smith@bristol.ac.uk | Nonparametric regression, Image analysis

Page 129: 33rd Research Students' Conference in Probability and Statistics

Name | Institution | Year | Email | Research Interests
Joanna Smith | University of Glasgow | 2 | j.smith@stats.gla.ac.uk | Shape analysis
Michelle Stanton | Lancaster University | 3 | m.stanton@lancaster.ac.uk | Spatial and spatio-temporal epidemiology; tropical disease epidemiology
Natalie Staplin | University of Southampton | 2 | nds105@soton.ac.uk | Survival Analysis
Kara Nicola Stevens | University of Bristol | 2 | kara.stevens@bristol.ac.uk | Time Series Analysis
Alexander Strawbridge | MRC Cambridge | 2 | alexander.strawbridge@mrc-bsu.cam.ac.uk | Measurement Error
David Suda | Lancaster University | 3 | d.suda@lancs.ac.uk | Stochastic Calculus, Bayesian Inference, Computational Statistics
James Sweeney | Trinity College Dublin | 3 | sweeneja@tcd.ie | Spatial statistics, multidimensional integration, Nonparametric regression
Sarah Taylor | Lancaster University | 1 | s.taylor14@lancaster.ac.uk | Modelling of textures using wavelets
Alexandre Thiery | University of Warwick | 1 | a.h.thiery@warwick.ac.uk | Monte Carlo methods - Statistical physics

Page 130: 33rd Research Students' Conference in Probability and Statistics

Name | Institution | Year | Email | Research Interests
Howard Thom | MRC Cambridge | 1 | howard.thom@mrc-bsu.cam.ac.uk | Model Averaging, Cost Effectiveness Analysis
Maria R Thomas | Queen Mary University London | 3 | mdr@maths.qmul.ac.uk | Bayesian statistics and dose finding in clinical trials
Helen Thornewell | University of Surrey | 3 | h.thornewell@surrey.ac.uk | Experimental Design
Tomas Toupal | University of West Bohemia | 2 | ttoupal@kma.zcu.cz | statistics
Michael Tsagris | University of Nottingham | 1 | pmxmt1@nottingham.ac.uk | Robust Statistics
Eleni Verykouki | University of Nottingham | 2 | pmxev@nottingham.ac.uk | Bayesian Statistics, MCMC
Rountina Vrousai | Trinity College Dublin | 2 | vrousair@tcd.ie | Bayesian methods for spatial-temporal analysis
Jenny Wadsworth | Lancaster University | 2 | j.wadsworth@lancaster.ac.uk | Extreme value theory; Bayesian methods, especially nonparametrics
Eva Wagnerova | University of West Bohemia | 1 | ewa@kma.zcu.cz | statistics

Page 131: 33rd Research Students' Conference in Probability and Statistics

Name | Institution | Year | Email | Research Interests
Neil Walker | University of Bristol | 3 | neil.walker@fera.gsi.gov.uk | Environmental Statistics
Chun Wang | University of Nottingham | 3 | pmxcw1@nottingham.ac.uk | Mathematical Finance
Kevin Wilson | Newcastle University | 3 | k.j.wilson@ncl.ac.uk | Bayesian inference, Bayes linear methods, experimental design
Colin Worby | University of Nottingham/HPA | 1 | colin.worby@hpa.org.uk | Infection Modelling, Markov Models, MCMC
Alan Wright | University of Plymouth | 1 | alan.wright3@plymouth.ac.uk | Genetic Epidemiology
Yang Xia | MRC Cambridge | 1 | yang.xia@mrc-bsu.cam.ac.uk | Multi-state modelling and survival analysis
Tatiana Xifara | Lancaster University | 1 | t.xifara@lancaster.ac.uk | MCMC Methods, Bayesian Statistics
Lei Yan | University of Nottingham | 2 | pmxly1@nottingham.ac.uk | Image Analysis, Stochastic Processes
Peng Yin | Newcastle University | 1 | peng.yin@ncl.ac.uk | statistics analysis of missing data

Page 132: 33rd Research Students' Conference in Probability and Statistics

Name | Institution | Year | Email | Research Interests
Yeung Wai Yin (Winnie) | Queen Mary, University of London | 2 | w.y.yeung@qmul.ac.uk | Biased Coin Design in clinical Trials
Ben Youngman | University of Sheffield | 3 | b.youngman@sheffield.ac.uk | Extreme value theory
Nur Fatihah Mat Yusoff | National University of Ireland, Galway | 2 | n.matyusoff1@nuigalway.ie | Structural Equation Modeling, Correspondence Analysis, Measurement Error, Latent Variable
Vytaute Zabarskaite | University of Nottingham | 1 | pmxvz@nottingham.ac.uk | Stochastic processes in Mathematical Finance
Piotr Zwiernik | University of Warwick | 3 | p.w.zwiernik@warwick.ac.uk | algebraic and geometric methods in statistics, model identifiability, asymptotics under non-regular scenarios

Page 133: 33rd Research Students' Conference in Probability and Statistics

Best Talks and Poster

Prizes will be awarded to the three best talks and the best poster as voted for by yourselves, the delegates.

Please use this page to vote for your favorite two talks and your favorite poster and hand it in during lunchtime on Wednesday.

1st best talk:

2nd best talk:

Best poster:

Page 134: 33rd Research Students' Conference in Probability and Statistics

Back of RSC 2010 voting slip

Page 135: 33rd Research Students' Conference in Probability and Statistics
Page 136: 33rd Research Students' Conference in Probability and Statistics

Department of Statistics
University of Warwick
Coventry
CV4 7AL
www2.warwick.ac.uk/fac/sci/statistics/postgrad/rsc/2010/