Prosper BDA Paper Draft 18December2009(2)-1

    Borrower Decision Aid for People-to-People Lending

    Lauri Puro
    Helsinki University of Technology
    Department of Industrial Engineering and Management
    P.O. Box 5500, FIN-02015 TKK, FINLAND

    Jeffrey E. Teich
    New Mexico State University
    Management Department
    New Mexico State University, Las Cruces, NM 88003, USA

    Hannele Wallenius
    Helsinki University of Technology
    Department of Industrial Engineering and Management
    P.O. Box 5500, FIN-02015 TKK, FINLAND

    Jyrki Wallenius
    Helsinki School of Economics
    Department of Business Technology
    POB 1210, Helsinki 00101, [email protected]

    December 2008 (Revised December 2009)

    All rights reserved. This study may not be reproduced in whole or in part without the authors' permission.

    Abstract

    In setting up and bidding in online auctions, people face difficult strategic decisions. In this

    study, a Borrower Decision Aid is introduced, which will help formalize the decision making

    process of the sellers, or borrowers in this case, in one particular P2P loan auction site,

    Prosper.com. The vast amount of real-life bidding data available in this online auction

    enables us to build new kinds of tools for decision makers. The Borrower Decision Aid helps

    the borrower to quantify her strategic options, such as starting interest rate, and the amount of

    loan requested. We identify which variables concerning the borrower are related to the

    probability of successfully securing a loan and the final interest rate.

    Keywords: people-to-people lending, decision support, reverse auctions

    1 Introduction

    1.1 Background

    Prosper.com is the first people-to-people lending marketplace, based on an online reverse

    auction. In this marketplace, people make applications for loans, called listings, and then

    other people make bids on these listings. The winning bidders get to fund the loan and the

    interest rate is determined by the auction: the more competition, the lower the interest rate.

    In other words, the idea is to link the person in need of money with people willing to lend

    money without an intermediating bank. Typically a loan is funded with many bidders

    (lenders), because most lenders fund only $50-$200 per loan. Lenders bid for these

    small amounts across many loans to help diversify their risk. Prosper.com was launched

    publicly in February 2006, and has so far brokered over $150 million worth of loans [17],

    [22].

    In this study, we focus on the role of the borrower, i.e. the person who sets up the listing for a

    loan. The borrower has several important strategic decisions to make, which can later

    determine if she gets the loan funded or not. The purpose of this study is to provide decision

    support for the borrower when making these important decisions. In the literature there are

    only a few publications that discuss decision support in auctions in general; examples include

    [1], [8], [15], [16], [23] and [25]. Their angle differs from ours, though.

    Our study has significant practical importance. Currently, borrowers set their listing

    parameters based on insufficient data, such as the average interest rate. In this study, we

    introduce a framework to analyze the borrower's strategic decisions in terms of success

    probabilities and estimated final interest rates. A Borrower Decision Aid (BDA) is described,

    which enables the borrower to evaluate her strategic options quantitatively. This is a

    significant practical improvement to the current situation where Prosper.com only provides

    scant advice on the starting rate and no advice on the amount of the loan.

    In addition to being practically important, our study is of theoretical interest: the

    framework and methods used in constructing the tool could be applied to other online

    auction sites.

    1.2 Objectives of the Research

    The main objective of this study is to develop a decision support tool for the borrowers. This

    tool helps the borrowers evaluate their strategic options in quantitative terms. In more detail,

    we

    1. identify the most important factors that affect the outcome of the auction, that is, the

    borrower's chances of getting the loan funded;

    2. identify the most important decision variables that the borrower can change in order

    to influence the outcome of the auction; and

    3. develop a framework and methods to compare different strategic options in

    quantitative terms.

    We look at all the information available and compare it to empirical data on Prosper.com

    listings. The identified factors are then divided into those which the borrower can influence

    and those that are part of the credit report. Both types of variables are needed in this study,

    but naturally the ones that the borrower can influence are the ones we provide advice on. We

    examine different methods of comparing strategic options and choose the best methods and

    variables for the borrower decision aid. The decision support tool is then constructed and

    tested. This study is limited to Prosper.com auctions only. The framework and methods of

    constructing the borrower (or seller) decision aid can, however, be extended to other auction

    sites as well.

    1.3 Data and Research Methods

    This study is based on empirical data provided by Prosper.com. Much of the data is freely

    available on the Prosper.com website. However, accessing the credit records requires one to

    register as a lender on the site. In total there were 312,562 listings made on Prosper.com (up

    to July 2008). Prosper allows access to these listings, which form the basic population data in

    our study.

    The strategic decision making of the borrowers is examined with the help of multivariate

    statistical analyses, in particular ordinary least-squares regression and logistic regression

    analysis. The BDA itself is implemented as a website, which could be made available to the

    public and/or implemented on Prosper's own site or a so-called third-party site.

    1.4 Organization of the Paper

    The first section provided the background and objectives of this study. In section

    2, literature related to the borrower's strategic decisions is briefly reviewed. In section 3 the

    data used in this study is introduced and the borrower's basic strategic decisions are charted.

    In this section, the most influential decision variables are identified for the development of

    the BDA. Section 4 describes the construction of the BDA and introduces the underlying

    methods. In section 5, the BDA website is described and the two different methods of

    providing support are compared. Section 6 concludes the study.

    2 Literature Study

    The amount of auction literature is vast. Several of the ground-breaking discoveries were

    made in the 1950s and 1960s when bidding behavior was modeled using a game-theoretic

    framework. The latest wave of research started after the emergence of online auctions. In

    particular, the online environment enabled researchers to carry out empirical studies with data

    gathered from real-life auctions (see, e.g., [3], [18], [25], [26]). This was a clear improvement

    to previous laboratory studies with university students ([10], [11]).

    In the traditional auction literature, the roles of the seller and the auctioneer often coincide,

    although this may not be the case in online auctions. The seller is expected to be able to

    choose the auction design parameters freely. Much of the literature has focused on comparing

    different auction mechanisms in different situations and determining which mechanisms

    provide superior profit for the seller ([19], [20]); or which mechanisms are efficient ([13]). In

    online auctions the roles of the seller and the auctioneer rarely coincide. The auctioneer is the

    website that facilitates the auction. The auctioneer has usually chosen some simple and

    universal auction mechanism that all sellers are obliged to use. Therefore, the strategic

    choices of the seller are constrained by the parameters of the chosen auction mechanism.

    This does not make the strategic decisions of the sellers any less important; on the contrary, it

    emphasizes the importance of the few remaining decision variables at the disposal of the

    seller.

    The classic decision variable of the seller is the starting price. The importance of the starting

    price depends heavily on the type of item sold and the auction mechanism. The literature on

    the effect of the starting price on the final price is mixed. For example, [14]

    found empirical evidence that, under some conditions, a lower starting price can

    eventually lead to a higher final price (in a forward auction). They suggested reduced barriers

    to entry and commitment of bidders as possible reasons. The sunk search and monitoring

    costs make it psychologically difficult for the bidder to walk away from the auction.

    Conversely, [9] showed that the correlation between the starting price and the final price was

    positive. This would mean that by entering a higher starting price, the expected final price is

    higher as well. One reason they suggested was that by entering a higher starting price, the

    seller is able to signal to the bidders that the item is worth at least that much. The higher

    starting price leads to a higher final price also if competition among bidders is generally very

    weak. In an extreme case, with just one interested buyer, the final price will equal the starting

    price, and therefore the higher starting price leads to a higher final price. Gilkeson and

    Reynolds [9] agreed with the theory about reduced barriers to entry and commitment, but

    they claimed that it would affect only the probability of the auction succeeding (i.e. the item

    being sold), and would not increase the final price. They studied eBay auctions that had the

    option of a secret reserve price and they showed that a higher starting price leads to

    a lower success probability, but a higher closing price. In [18] the correlation between starting

    price and final price was also found to be positive.

    The tradeoff suggested by [9] provides an excellent framework for our study. The borrower

    has two aims on Prosper.com. First, she wants to get her loan funded in a successful auction

    event. Second, she wants the interest rate to be as low as possible. The tradeoff makes this

    decision difficult. If the borrower wants to be sure that the loan gets funded, she must settle

    for a higher interest rate and a higher starting rate. Conversely, a lower starting rate may result in a

    lower final rate, but at a reduced probability of funding.

    Prosper.com is a multi-unit auction, because the loan will normally be funded by multiple

    bidders. This makes the situation even more interesting. In multi-unit auctions, the borrower

    can choose the number of items (i.e. loan amount) sold in addition to the starting price. This

    decision has a similar kind of strategic value as the starting price. Most of the previous studies

    on multi-unit auctions, however, look at the amount as a question about an optimal lot size in

    repeated auctions ([2], see also [24]). On Prosper.com, however, the same borrower creates

    only one or at most a couple of listings, and therefore the importance of the amount as a

    strategic decision variable is further emphasized.

    3 Strategic Decision Making of the Borrowers

    This section begins by introducing the data used in this study, including an explanation of

    how the Prosper.com auction site operates from the borrower's viewpoint. Next, we identify

    the most important decision and credit variables that affect the outcome of the listing. These

    variables will later form the heart of the Borrower Decision Aid.

    3.1 An Example from the Prosper.com Website

    An example of a loan listing from Prosper.com is shown in Figure 1. Here the borrower is

    seeking a $7,300 loan to expand a small business. There are still over 37 hours left in the

    auction and the loan is already fully funded. The starting rate of the listing was 25.96% and it

    has been bid down to 13%. The borrower's credit grade is C and he is a verified homeowner.

    The debt-to-income ratio of the borrower is 39%. The listing includes a short description,

    where the borrower usually explains how she is planning to use the money. Additionally, the

    listing may include a picture and endorsements from friends and family.

    FIGURE 1 ABOUT HERE

    3.2 Description of the Data Used in the Study

    Prosper.com offers a unique opportunity to access a vast amount of real-life online auction

    data ([22]). Prosper.com has been in operation since February 2006, and between then and

    August 2008 a total of 312,562 listings were created. Our data set starts from May 2006 and

    includes all non-active listings made before August 2008. The number of listings in our

    sample is 293,976. Of these listings, 26,251 (8.4%) have been funded. This data will be

    used as the basic population. The data is freely available on the Prosper.com website, although

    for additional credit information one must register as a lender. The data was then imported into

    MS Access, where it was further analyzed.

    Large amounts of data are available for every listing including what is available to registered

    lenders from the credit report. All the factors are listed in an appendix. The most

    important factors are amount requested, starting rate, credit grade, debt-to-income ratio,

    duration, funding option, homeownership, status, and end date. In addition, other

    demographic information is available. The credit information is pulled from the Experian Scorex

    Plus (SM) system, which specializes in providing people's credit information.

    The terms of the loans on Prosper.com are fixed to a three-year, fully amortized, unsecured

    loan. The borrower can choose the loan amount freely between $1,000 and $25,000. The

    average requested loan amount is $7,500 (median $5,000). The starting rate of the auction

    can be set anywhere between 0% and 36%. The average starting rate is 19% (median 18%).

    The starting rate naturally is very sensitive to the credit grade of the borrower. The credit

    grade (on a scale from AA to HR, where AA is best and HR worst) is calculated by Experian. It

    takes into account all the credit information variables and grades the borrower accordingly.

    Debt-to-income ratio is calculated by dividing the borrower's total amount of debt by her

    income. The ratio is limited to between 0% and 101%, but because the income is self-reported, this

    statistic could be inaccurate.
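
To make the fixed loan terms described above concrete, the following Python sketch computes the level monthly payment of a fully amortized 36-month loan. It assumes that the annual rate is a nominal rate compounded monthly; that convention, and the function name `monthly_payment`, are illustrative assumptions rather than Prosper.com's documented terms.

```python
def monthly_payment(principal: float, annual_rate: float, months: int = 36) -> float:
    """Level monthly payment on a fully amortized loan.

    Assumption: annual_rate is a nominal annual rate compounded monthly
    (e.g. 0.18 for 18%). This is an illustrative convention only.
    """
    if months <= 0:
        raise ValueError("months must be positive")
    r = annual_rate / 12.0  # periodic (monthly) rate
    if r == 0:
        return principal / months  # no interest: repay the principal evenly
    # Standard annuity formula: P * r / (1 - (1 + r)^-n)
    return principal * r / (1.0 - (1.0 + r) ** -months)


if __name__ == "__main__":
    # The $7,300 listing from the example, at its starting rate and bid-down rate.
    for rate in (0.2596, 0.13):
        print(f"{rate:.2%}: {monthly_payment(7300, rate):.2f} per month")
```

Under these assumptions, a lower final rate translates directly into a lower monthly payment, which is why the borrower cares about the final rate as well as about getting funded.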

    The borrower can choose the duration of the auction from four alternatives: 3, 5, 7 and 10

    days. 7 days is the most commonly used auction duration. The borrower has the option to end

    the auction as soon as the loan gets funded. This practically means that the borrower is

    satisfied with the starting rate and needs the money as quickly as possible. Otherwise the

    auction will be open for the full duration. The status of the listing can be completed, expired,

    withdrawn or cancelled. The completed status means that the listing was funded and the loan

    issued. The expired status means that the auction ran its full duration but never got funded.

    The withdrawn status means that at some point during the auction the borrower withdrew the

    listing. The cancelled status means that Prosper.com has cancelled the auction because of

    some faulty listing information. Furthermore, the listing can be active, which means that it is

    currently running. In this study the active listings have been excluded.

    3.3 Identifying Influential Variables

    In this section we identify the most important variables that affect the success probability of

    the listing by using pairwise correlation tests. In order to perform the tests some of the

    variables had to be transformed. The most important transformation was done to the "status"

    variable. This was transformed into a dummy variable. The status "completed" was entered as

    1 and "expired", "withdrawn" and "cancelled" as 0. We do not know the reasons behind the

    borrowers' withdrawal decisions. However, the number of withdrawn listings is very

    significant, up to 30% of all listings. Therefore, we cannot exclude them all. It seems that a

    great majority of these listings have been withdrawn because of lack of interest by the bidders.

    Only 1% of all withdrawn listings were fully funded. Apparently, some people do not want to

    see their listing expire, if the bidders show only little interest in it. They would rather

    withdraw it and create a new listing with different listing parameters. In total, there were

    140,265 borrowers who made listings on Prosper.com. Of these people, 67,297 (48%)

    had multiple listings. Most of the people who made multiple listings had done so because

    their first listing did not get funded. As many as 77% of the people who had made multiple listings

    had at least one unsuccessful listing. This further underscores the point that the borrowers

    could really use a strategic decision support tool, which would help them in finding the right

    listing parameters the first time.

    For this part of the study, the credit grade scale AA-HR was transformed to numbers between

    1 and 7. This is an unorthodox way of describing the credit grade, because the credit grade is

    ordinal scaled, not interval scaled as 1-7 would suggest. Later, this problem is solved

    by analyzing each credit grade separately, but this scaling allows easy preliminary

    examination. The funding option was transformed into a dummy variable so that "Open for

    duration" was entered as 1 and "Close when funded" as 0. The same approach was used with

    the variable "homeownership".
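
The transformations described above can be sketched in Python as follows. The field names of the listing record, the list of seven grade labels, and the direction of the numeric scale (AA as 1, HR as 7) are assumptions made for illustration; the text specifies only the range 1 to 7 and the dummy coding.

```python
# Assumption: the seven credit grades, ordered from best to worst.
CREDIT_GRADES = ["AA", "A", "B", "C", "D", "E", "HR"]
UNFUNDED_STATUSES = {"expired", "withdrawn", "cancelled"}


def encode_listing(listing: dict) -> dict:
    """Turn one raw listing record into the numeric variables used in the tests."""
    status = listing["status"]
    if status != "completed" and status not in UNFUNDED_STATUSES:
        # Active listings are excluded from the study, so they are rejected here.
        raise ValueError(f"unexpected status: {status!r}")
    return {
        # Status dummy: completed = 1; expired, withdrawn, cancelled = 0.
        "funded": 1 if status == "completed" else 0,
        # Assumption: AA maps to 1 and HR to 7.
        "credit_grade_num": CREDIT_GRADES.index(listing["credit_grade"]) + 1,
        # Funding option dummy: "Open for duration" = 1, "Close when funded" = 0.
        "open_for_duration": 1 if listing["funding_option"] == "Open for duration" else 0,
        "homeowner": 1 if listing["is_homeowner"] else 0,
    }


if __name__ == "__main__":
    print(encode_listing({
        "status": "completed",
        "credit_grade": "C",
        "funding_option": "Open for duration",
        "is_homeowner": True,
    }))
```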

    In Table 1, the correlations between different listing variables and the "status" dummy

    variable are presented. The pairwise correlation test was performed with all reasonable

    variables. Some credit information variables were omitted if they suffered from a small data

    sample or if they were too similar to other variables. The variables that the borrower can

    have an influence on have been presented in the first four rows of the table. The rest of the

    rows are credit information variables, which the borrower cannot influence, at least in the

    short run.
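
A pairwise correlation between one listing variable and the funded/not-funded dummy can be sketched as below. The sketch assumes the Pearson coefficient (which, with a 0/1 variable, is the point-biserial correlation) and pairwise deletion of missing values; the paper does not state which coefficient was used, and the sample values are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation, skipping pairs where either value is missing (None)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete observations")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Invented toy data: starting rates and the status dummy (1 = funded).
    starting_rate = [0.10, 0.12, 0.18, 0.22, 0.25, None]
    funded = [0, 0, 1, 0, 1, 1]
    print(round(pearson(starting_rate, funded), 3))
```

Because pairs with a missing value are dropped, the number of observations behind each coefficient can differ from variable to variable.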

    As a whole, the correlations are relatively small. There are a few logical reasons for this. First

    of all, we have used all the data available. This enables us to see the big picture, but for

    example the starting rate is very sensitive to the credit grade. For example, a 15% starting

    rate might guarantee the success of the listing for an AA grade borrower, but the same

    starting rate might be too low for an HR grade borrower to get her listing funded. Therefore,

    in the full data set the correlations are lower than when examined one credit grade at a time.

    Second, the starting rate in particular, i.e. the price of the loan, is very sensitive to common

    market interest rates and risk premiums. We have used data from the full two and a half years

    of time. During this time the federal funds rate has varied between 2% and 5.25% (Federal

    Reserve [7]). In addition, the recent credit crisis has increased the risk premiums

    substantially. Therefore, the correlations would be higher if we looked at data from

    shorter periods of time, where the market fundamentals would be similar for all listings.

    The correlation analysis done with the full data set does enable us to compare the significance

    of different variables. As we can see, the credit information variables have generally higher

    correlations than the decision variables. This is quite logical, as people with low credit grades

    have difficulties in obtaining a loan no matter how high, for example, the starting rate is. All

    the correlations are statistically significant, because of the high number of observations. The

    "amount requested" and "starting rate" have higher correlations than the "funding option" and

    the "duration". The signs of these correlations are in line with [9]. A higher starting rate

    increases the borrower's chances of getting the loan funded (note that the Prosper.com auction

    mechanism is reversed in the sense that a high interest rate is bad for the borrower, i.e. the

    seller, and good for bidders). Logically, a higher amount requested decreases the borrower's

    chances of having a successful listing. The funding option "Open for duration", entered as 1,

    increases the borrower's success probability, as is the case with the longer duration.

    Next to the credit grade, the delinquency-related variables have the second highest

    correlation. The "current delinquencies" seem to be the most influential of these variables.

    The "homeownership" shows some correlation and the correlation of "debt-to-income ratio" is

    relatively low. This is the case with the variable "income" as well. All the signs of the

    variables are logical.

    TABLE 1 ABOUT HERE

    In Table 2 the correlation analysis is repeated for two different credit grades: A and D. This

    demonstrates how different the two credit grades are from each other. Now that the aggregate

    credit information variable is already taken into account in the data sampling itself, we can

    see that the decision variables become much more important. The correlations of the

    "amount" and the "starting rate" are now relatively high. The correlations of the "funding

    option" and the "duration" seem to remain at a low level. Note that the number of

    observations in Table 1 is different across attributes. For example, the "Debt to Income"

    attribute was not calculated in cases where income was not reported, or it was zero.

    Now that the samples already contain information about the credit grade, the correlations of

    the rest of the credit information variables are significantly lower. It would seem that the

    debt-to-income ratio would have the highest correlation with success of the listing within a

    credit grade. Homeownership seems to be problematic. In credit grade A there is no

    correlation at all. In credit grade D the correlation is negative, which is counter-intuitive.

    Owning a house seems to negatively affect the borrower's chances of getting the loan funded.

    Because of this inconsistency, this variable was omitted (footnote 1). The delinquency-related

    credit information variables seem to have some correlation with the success of the listing, but

    this time the "current delinquencies" are more heavily emphasized. The "amount delinquent"

    and "delinquencies last 7 years" seem to underperform the "current delinquencies" throughout

    the data. The last credit information variable, the "income", behaves very inconsistently. In

    credit grade A the correlation is very small but positive. In credit grade D the correlation is

    negative, which is again counter-intuitive. It could be that people expect borrowers with high

    income to manage their financial situation better than credit grade D implies. The correlation

    of the income variable is relatively low, perhaps because the income is self-reported and

    possibly inaccurate.

    TABLE 2 ABOUT HERE

    Based on this analysis the most influential variables are the "starting rate", "amount

    requested", "credit grade", "debt-to-income ratio" and "current delinquencies" (footnote 2). The

    borrower decision aid was designed with this set of variables.

    4 Borrower Decision Aid: Underlying Model

    In this section we develop the Borrower Decision Aid using an underlying logistic regression

    model and a query method, a brute-force categorization of the database. They are alternative

    models which provide information about the probability of the listing being funded given

    initial parameter settings such as starting rate, loan amount, credit rating, etc. The final rate

    given these parameters is also predicted via an OLS regression model. In general, the

    predictions are based on the most recent six months of data, instead of the whole 2.5 years

    available, because the market for loans has substantially changed during that time period. The

    Federal Funds rate has varied between 2% and 5.25% during the 2.5 years (Federal Reserve [7]).

    Also the market risk premiums have fluctuated. For example, the credit crisis that started in

    spring 2007 has increased the risk premiums significantly. In Figure 2 we can see how the

    interest rates have fluctuated on Prosper.com during the previous year for the various credit

    grades.

    Footnote 1: A reason might be that the individuals were never questioned about the size of their mortgage.

    Footnote 2: For further validation the correlations were rerun for two separate subsamples (July-December 2007

    and January-June 2008). Our results showed that the same variables maintained their high correlations

    with listing success over time as well, justifying our choice of variables for the model.

    FIGURE 2 ABOUT HERE

    4.1 Logistic Regression Model

    The Borrower Decision Aid (BDA) estimates the probability of getting the loan funded,

    given planned listing parameters and the borrower's credit information. We opted to use the

    logistic regression model because ordinary linear regression is not suited to a dependent variable that

    is binary (listing getting funded or not). Gilkeson and Reynolds [9] used logistic regression

    in their study to examine how the starting price affected auction success. The logistic

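    regression is based on the cumulative logistic probability function described below (see, e.g.,

    [21]).

    \[ f(z) = \frac{1}{1 + e^{-z}} \]

    where f(z) represents the probability of funding given the set of independent variables defined

    as follows:

    \[ z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k \]

    Here x_1, ..., x_k are the independent variables and \beta_0, ..., \beta_k are the estimated coefficients. Otherwise the logistic regression works in a similar manner to the ordinary regression model.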
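
As a minimal Python sketch, the standard cumulative logistic function f(z) = 1 / (1 + e^(-z)) and its use for turning listing parameters into a funding probability could look as follows. The coefficient values, variable names, and units are hypothetical placeholders chosen only for illustration; they are not the estimates reported in the paper's tables.

```python
from math import exp


def logistic(z: float) -> float:
    """Cumulative logistic function f(z) = 1 / (1 + e^(-z)), computed stably."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)  # avoids overflow of exp(-z) for very negative z
    return ez / (1.0 + ez)


def funding_probability(coefficients: dict, listing: dict) -> float:
    """Probability of funding: logistic of the linear combination z."""
    z = coefficients["intercept"] + sum(
        coefficients[name] * value for name, value in listing.items()
    )
    return logistic(z)


# Hypothetical coefficients for one credit grade (illustration only).
EXAMPLE_COEFFICIENTS = {
    "intercept": -4.0,
    "starting_rate": 20.0,          # rate as a fraction, e.g. 0.18
    "amount_thousands": -0.15,      # requested amount in thousands of dollars
    "debt_to_income": -1.0,         # ratio as a fraction, e.g. 0.40
    "current_delinquencies": -0.5,  # number of current delinquencies
}

if __name__ == "__main__":
    listing = {
        "starting_rate": 0.18,
        "amount_thousands": 5.0,
        "debt_to_income": 0.40,
        "current_delinquencies": 0,
    }
    print(round(funding_probability(EXAMPLE_COEFFICIENTS, listing), 3))
```

Keeping one coefficient set per credit grade would mirror a model that is estimated separately for each grade.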

    The logistic regression was estimated separately for each credit grade. The reason for this was

    that borrowers behave very differently between the credit grades as we saw in Table 2. Thus,

    by calculating the credit grades separately we can get more accurate results that match the

    current credit grade specifically. Another reason for this is that the credit grade itself is an

    important factor in determining whether the listing gets funded or not. However, adding it to

    the model as a variable is difficult because it is ordinal scaled, not interval scaled.

    The independent variables "starting rate", "amount", "debt-to-income ratio" and "current

    delinquencies" were chosen according to the analysis in section 3.3. In Table 3 the

    coefficients of the logistic regression models are displayed. They all have logical

    signs. Increasing the "starting rate" increases one's chances of getting the loan funded.

    Conversely, increasing the requested "amount", "debt-to-income ratio" or "current

    delinquencies" decreases one's chances of getting the loan funded. All the independent

    variables are statistically significant. In logistic regression there is no exact equivalent of

    the coefficient of determination R². Instead we use McFadden's pseudo-R².
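
McFadden's pseudo-R² compares the log-likelihood of the fitted model with that of an intercept-only model: 1 - ln L(model) / ln L(null). The Python sketch below computes it from observed outcomes and predicted probabilities; the function names and the sample values are illustrative assumptions.

```python
from math import log


def log_likelihood(outcomes, probabilities) -> float:
    """Bernoulli log-likelihood of 0/1 outcomes under predicted probabilities."""
    eps = 1e-12  # keeps log() finite if a probability is exactly 0 or 1
    total = 0.0
    for y, p in zip(outcomes, probabilities):
        p = min(max(p, eps), 1.0 - eps)
        total += y * log(p) + (1 - y) * log(1.0 - p)
    return total


def mcfadden_pseudo_r2(outcomes, probabilities) -> float:
    """1 - lnL(model) / lnL(null), where the null model predicts the base rate."""
    base_rate = sum(outcomes) / len(outcomes)
    if base_rate in (0.0, 1.0):
        raise ValueError("outcomes must contain both funded and unfunded listings")
    ll_null = log_likelihood(outcomes, [base_rate] * len(outcomes))
    ll_model = log_likelihood(outcomes, probabilities)
    return 1.0 - ll_model / ll_null


if __name__ == "__main__":
    funded = [1, 0, 0, 1, 0, 0, 0, 1]                      # invented outcomes
    predicted = [0.7, 0.2, 0.1, 0.6, 0.3, 0.2, 0.1, 0.5]   # invented model output
    print(round(mcfadden_pseudo_r2(funded, predicted), 3))
```

A model that predicts no better than the overall funding rate scores 0 on this measure, and values approach 1 only as the predicted probabilities approach the observed outcomes.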

    TABLE 3 ABOUT HERE

    As we can see from Table 4, as we introduce new variables one after another, starting with

    "amount", followed by "starting rate", "debt-to-income ratio", and "current delinquencies", the

    pseudo-R² increases to 0.208, but adding additional independent variables does not improve

    the model significantly (footnote 3). We use the query method, developed below, to cross-validate the

    model.

    TABLE 4 ABOUT HERE

    4.2 Query Method

    The query method is an intuitive data-driven way of determining the success probability.

    Based on the listing parameters entered by the borrower, the database is searched for similar

    listings within a certain range (±25%) of the value of each parameter. This ensures that the

    query returns an adequate number of similar kinds of listings. The "current delinquencies"

    were queried as a binary true/false variable. Then the BDA calculates the success ratio of this

    sample, i.e. how many listings got funded.

    The logistic regression can calculate the precise success probability with exact listing

    parameters, whereas the query method takes a set of listings which have relatively similar

    listing parameters. The query method requires vast amounts of data (i.e., listings) to work

    properly. When there are not enough similar listings, the reliability of the query method

    quickly decreases. This problem arises when the given listing parameters are less frequently

    used. As stated previously, all the methods use 6 months of data. For the query method,

    however, the data sample is extended to 12 months if the number of similar listings is below

    20. The user of the BDA will naturally be alerted when the sample period is increased to 12

    months.
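
The query method described above can be sketched in Python as follows. The record field names, the restriction to the same credit grade, and the approximation of a month as 30.44 days are assumptions made for this illustration; the ±25% range, the true/false treatment of current delinquencies, and the extension from 6 to 12 months below 20 similar listings follow the description.

```python
from datetime import date, timedelta

RANGE_PARAMETERS = ("starting_rate", "amount", "debt_to_income")


def is_similar(listing: dict, target: dict, tolerance: float = 0.25) -> bool:
    """True if the listing is within +/-25% of the target on every range parameter."""
    # Assumption: only listings with the same credit grade are compared.
    if listing["credit_grade"] != target["credit_grade"]:
        return False
    # Current delinquencies are compared as a true/false flag, not as a count.
    if (listing["current_delinquencies"] > 0) != (target["current_delinquencies"] > 0):
        return False
    return all(
        abs(listing[name] - target[name]) <= tolerance * abs(target[name])
        for name in RANGE_PARAMETERS
    )


def query_success_ratio(listings, target, today: date, min_sample: int = 20):
    """Return (success ratio or None, sample size, months of data used)."""
    for months in (6, 12):
        cutoff = today - timedelta(days=round(months * 30.44))  # approximate months
        sample = [
            item for item in listings
            if item["end_date"] >= cutoff and is_similar(item, target)
        ]
        if len(sample) >= min_sample:
            break  # enough similar listings; no need to widen the window
    if not sample:
        return None, 0, months
    funded = sum(item["funded"] for item in sample)
    return funded / len(sample), len(sample), months
```

The third element of the result tells the caller whether the window was widened to 12 months, so that the user can be alerted.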

    Footnote 3: For further validation the logistic regression model was recalculated with out-of-sample data. Results

    using data from July-December 2007 showed that the same variables remained significant in the model

    and no additional variables that would have improved the model significantly were found in the

    July-December 2007 sample either. In other words, the model structure remains consistent over time, although the

    individual regression coefficients require regular updating for current market conditions.

    4.3 Regression Model for the Final Rate

    The final rate of the listing was estimated with an ordinary least-squares (OLS) regression model. In Table 5 the

    correlations between the regression variables have been calculated for credit grade A. The

    correlation was highest between the "final rate" and the "starting rate". The correlation

    between the "final rate" and the "amount" is also high. The "debt-to-income ratio" and "current

    delinquencies" have lower correlations with the "final rate", but they are still statistically

    significant. There is a risk of multicollinearity in the model, because the correlation between

    the independent variables "starting rate" and "amount" is 0.55. Both variables are vital for the

    model and therefore neither one was omitted.

    TABLE 5 ABOUT HERE

    The regression model is similar to the logistic regression model, but this time the dependent

    variable is the final rate. The final rate is a continuous variable and therefore an OLS

    regression model is appropriate.

    The results of the regression model, presented separately for different credit grades, can be

    seen in Table 6. Each regression coefficient is presented with the corresponding p-value

    associated with the t-test. Almost all of the variables are statistically significant at the 5%

    significance level. There are two exceptions: the variable amount in credit grade C (p = 0.081) and the

    variable debt-to-income ratio in credit grade HR (p = 0.455).

    The R² ranges from 0.51 to 0.69. The number of observations per credit grade ranges from 484 (HR) to

    1,535 (C). The number of observations is much smaller than in the logistic regression,

    because here we can use only completed listings. In the last column of Table 6, the final rate


    estimates have been calculated with the following listing parameters: starting rate 18%,

    amount $5,000, debt-to-income ratio 40% and zero current delinquencies. As we can

    see, the final rate estimate generally increases as the credit grade becomes worse, from 10.38% for

    grade AA to 17.82% for grade HR. As a whole, the regression model predicts the final rate reasonably well.

    The associated residual plots suggest unequal (increasing) variance for some of the

    variables, implying heteroskedasticity, in particular outside the range of common

    values. However, because we are not calculating prediction intervals in the BDA and the point

    estimates are still unbiased, this appears less serious.
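
    The final rate estimates quoted above can be recomputed from the coefficients reported in Table 6.
    The sketch below is only an illustration: the function and dictionary names are hypothetical, and
    the units (starting rate and debt-to-income ratio in percent, amount in dollars) are an assumption
    inferred from the fact that they bring the results to within about 0.1 percentage points of the
    last column of Table 6. The small remaining gaps are presumably due to rounding of the published
    coefficients. How current delinquencies are coded (count or indicator) is not stated here; the
    example uses zero, so the choice does not matter.

```python
# Coefficients copied from Table 6, per credit grade:
# (starting rate, amount, DTI, current delinquencies, constant)
FINAL_RATE_COEF = {
    "AA": (0.316, 0.00017, 0.005, 0.750, 3.659),
    "A":  (0.527, 0.00013, 0.009, 0.391, 2.717),
    "B":  (0.494, 0.00014, 0.004, 0.330, 4.624),
    "C":  (0.739, 0.00006, 0.008, 0.390, 0.443),
    "D":  (0.683, -0.0002, 0.010, 0.380, 3.713),
    "E":  (0.840, 0.00045, 0.023, 0.284, -1.005),
    "HR": (0.870, 0.00045, 0.004, 0.120, -0.260),
}


def estimate_final_rate(grade, starting_rate_pct, amount_usd, dti_pct, delinquencies):
    """Linear prediction of the final rate in percent (units are assumptions)."""
    b_rate, b_amount, b_dti, b_delinq, const = FINAL_RATE_COEF[grade]
    return (const
            + b_rate * starting_rate_pct
            + b_amount * amount_usd
            + b_dti * dti_pct
            + b_delinq * delinquencies)


if __name__ == "__main__":
    # Listing parameters used for the last column of Table 6.
    for grade in FINAL_RATE_COEF:
        rate = estimate_final_rate(grade, 18, 5000, 40, 0)
        print(f"{grade:>2}: {rate:.2f}%")
```

    Because the coefficients must be re-estimated regularly, a table like FINAL_RATE_COEF would in
    practice be refreshed from the latest data rather than hard-coded.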

    TABLE 6 ABOUT HERE

    4.4 Comparison of Query and Logistic Regression Methods

    As a brief cross-validation between the query method and the logistic regression method, we

    have produced Figures 3 and 4. In Figure 3 the BDA was run with the listing parameters credit

    grade A, debt-to-income ratio 40%, zero current delinquencies and $5,000 requested

    amount. The starting rate was increased from 1% up to 30%. In general, the two methods

    provide similar results. Within the reasonable range of starting rates, 9-17%, the results are very

    similar. For low starting rates (below 7%), the success rate estimates are unreliable for both

    methods, and the actual success rates are probably close to zero. For higher starting rates (above 17%), the

    actual success rates are probably in between the results produced by the two methods.
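
    The logistic regression curve of this experiment can be approximated from the grade A coefficients
    in Table 3. The following is a hedged sketch rather than the BDA's code: the function names are
    hypothetical, and it assumes that the starting rate and the debt-to-income ratio enter the model as
    fractions (0.15 for 15%), which is inferred only from the magnitude of the coefficients. Under that
    assumption a 15% starting rate gives a success probability of roughly 0.48.

```python
import math

# Grade A coefficients copied from Table 3:
# (constant, starting rate, amount, DTI, current delinquencies)
LOGIT_A = (-0.773, 16.895, -0.000156, -2.68859, -0.617)


def success_probability(coef, starting_rate, amount_usd, dti, delinquencies):
    """Logistic model: p = 1 / (1 + exp(-z)), z being the linear predictor."""
    const, b_rate, b_amount, b_dti, b_delinq = coef
    z = (const
         + b_rate * starting_rate      # assumption: rate as a fraction
         + b_amount * amount_usd
         + b_dti * dti                 # assumption: DTI as a fraction
         + b_delinq * delinquencies)
    return 1.0 / (1.0 + math.exp(-z))


def sweep_starting_rate(coef=LOGIT_A, amount_usd=5000, dti=0.40, delinquencies=0):
    """Success probability for starting rates of 1%..30%, as in the Figure 3 setup."""
    return [(pct, success_probability(coef, pct / 100, amount_usd, dti, delinquencies))
            for pct in range(1, 31)]


if __name__ == "__main__":
    for pct, p in sweep_starting_rate():
        print(f"{pct:2d}%  {p:.3f}")
```

    Since the starting rate coefficient is positive, the curve rises monotonically with the starting
    rate, so by construction the model cannot reproduce any flattening or dip that the query method
    might show in sparse regions.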

    FIGURE 3 ABOUT HERE

    In Figure 4, the analysis is repeated, but this time the starting rate was fixed at 15% and the

    amount was varied from $1,000 to $25,000. Again, the results of both methods are very

    similar, with the query method providing an upper bound and the logistic regression a lower bound

    for the success rate. The query method shows some fluctuations when the number of


    observations is very small. This is just one cross-section of the data; however on average the

    two methods should produce similar results.

    FIGURE 4 ABOUT HERE

    Borrower Decision Aid Website

    The BDA was implemented as a website. On the website the borrower fills in the blank fields

    in the borrower information window, as seen in Figure 5, and then clicks "Estimate". The results,

    shown here for an example, are presented on the screen below the input window. First, the tool prints the

    listing parameters that the borrower entered and shows the search criteria for the query

    method. Then a sensitivity table is presented, where the borrower can see how the estimated

    final rate and estimated success probability change with different requested amounts and

    starting rates. In this case, the estimated final rate with the given parameters was 11.65% and

    the success probability 0.46. These figures were calculated based on the regression models.

    By increasing the starting rate by one percentage point, the borrower can increase her chances of getting the

    loan funded to 0.50, but the final rate increases to 12.15%. By decreasing the requested

    amount, however, the borrower can increase the success probability to 0.50 and decrease

    the final rate to 11.50%. The table also shows some other combinations,

    allowing the borrower to choose the best option or to change the inputs and recalculate.
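
    A sensitivity table of this kind can be assembled by evaluating both regression models on a small
    grid around the borrower's inputs. The sketch below uses the grade A coefficients from Tables 3
    and 6; the function names, the step sizes (+/-1 percentage point, +/-$1,000) and the base inputs
    are hypothetical, and the unit conventions (percent for the final rate model, fractions for the
    logistic model) are assumptions. The inputs behind the website example are not stated, and the
    site relies on regularly updated coefficients, so this should not be read as a reproduction of
    the figures quoted above.

```python
import math

# Grade A coefficients copied from Table 3 (logistic) and Table 6 (final rate).
LOGIT_A = {"const": -0.773, "rate": 16.895, "amount": -0.000156,
           "dti": -2.68859, "delinq": -0.617}
OLS_A = {"const": 2.717, "rate": 0.527, "amount": 0.00013,
         "dti": 0.009, "delinq": 0.391}


def final_rate_pct(rate_pct, amount, dti_pct, delinq):
    """Estimated final rate in percent (assumption: inputs in percent)."""
    c = OLS_A
    return (c["const"] + c["rate"] * rate_pct + c["amount"] * amount
            + c["dti"] * dti_pct + c["delinq"] * delinq)


def success_prob(rate_pct, amount, dti_pct, delinq):
    """Estimated funding probability (assumption: model uses fractions)."""
    c = LOGIT_A
    z = (c["const"] + c["rate"] * rate_pct / 100 + c["amount"] * amount
         + c["dti"] * dti_pct / 100 + c["delinq"] * delinq)
    return 1.0 / (1.0 + math.exp(-z))


def sensitivity_table(rate_pct, amount, dti_pct, delinq,
                      rate_deltas=(-1, 0, 1), amount_deltas=(-1000, 0, 1000)):
    """Rows of (starting rate %, amount, final rate %, success probability)."""
    rows = []
    for da in amount_deltas:
        for dr in rate_deltas:
            r, a = rate_pct + dr, amount + da
            rows.append((r, a,
                         final_rate_pct(r, a, dti_pct, delinq),
                         success_prob(r, a, dti_pct, delinq)))
    return rows


if __name__ == "__main__":
    for r, a, fr, p in sensitivity_table(15, 5000, 40, 0):
        print(f"rate {r:2d}%  amount ${a:>6,}  final {fr:5.2f}%  p(funded) {p:.2f}")
```

    Reading the rows side by side makes the trade-off visible: a higher starting rate raises both the
    funding probability and the final rate, whereas a smaller amount raises the probability while
    lowering the final rate.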

    Below the sensitivity table, there are more detailed results. First, we can see the search

    criteria used by the query method. Then there is the final rate estimate with its coefficient of

    determination and number of observations. The tool calculates the estimated final rate

    according to the regression model as presented in section 1.8. In this case the estimate is

    11.63%, which is well below the starting rate of 15%. Below the estimated final rate,


    the tool presents the R² and the number of observations used in constructing the model. These

    basic regression diagnostics give some indication of the reliability of the results.

    The next analysis is the logistic regression. Here the BDA calculates the estimated success

    probability of the listing. In this case the probability of the listing getting funded is 46%.

    Again, the regression diagnostics, i.e. the pseudo-R² and the number of observations used in

    constructing the particular logistic regression model, are attached to the regression

    results. The final analysis is the query method. Here the BDA queries the database for

    listings similar to the one entered by the borrower. The search criteria were shown at the very

    beginning of the results box. In this case there were in total 51 similar listings, of which 24

    were funded. The implied success probability is 47% (24/51), which is very close to the one given by

    the logistic regression method. Finally, the 51 similar listings are printed individually

    (only the first four are shown in the figure). The borrower can then manually check what kind

    of listings people have made previously and how they turned out.

    FIGURE 5 ABOUT HERE

    Conclusions

    The main objective of this study was to develop a decision support tool for the borrowers in a

    P2P reverse auction lending environment. The underlying tool is based on regression models

    and data driven query methods. The tool enables the borrowers to evaluate their strategic

    options in quantitative terms. We found that there is a trade-off between having a low final

    rate and getting the loan funded, as the previous literature had suggested. In order to have a

    low final rate, the borrower must choose a lower starting rate. This, however, decreases the

    borrower's chances of getting the loan funded. Therefore, the borrower must consider these

    two factors and make the difficult trade-off decision about the targeted final rate and the


    acceptable risk in terms of success probability. The Borrower Decision Aid (BDA) quantifies

    this decision by calculating the estimated success probability and the estimated final rate. In

    addition, the borrower can fine-tune both the success probability and the estimated final rate

    by changing the loan amount. Requesting a smaller loan amount increases the success probability

    and decreases the final rate. If the borrower is not able to find a satisfactory starting

    rate that combines an acceptable success probability with a suitable final rate, she

    must decrease the loan amount.

    The BDA helps the borrower see the listing parameters in a strategic context and provides

    useful quantitative information to support the final decision. As such, the BDA naturally

    works only with P2P lending sites similar to Prosper.com. However, the methods used in

    constructing the tool could be used in other contexts as well. Firstly, the BDA could be

    extended to other online auctions with sufficient data available. The auctions would not have

    to be multi-unit; the success probability could instead be attached, for example, to exceeding the

    secret reservation price (used, for example, on eBay; see [12]). The final rate estimate is

    naturally even more widely applicable. A reliable estimate of the final price would be useful

    for the seller in any auction. The availability of data is probably the biggest constraint in

    expanding the use of the BDA.

    One of the benefits to auction sites of providing access to their raw auction data is the possible

    development of third-party sites providing tools to the general public. Prosper has already

    benefited from this aspect of their philosophy and will continue to do so. We have provided

    our BDA to Prosper, and they are considering implementing our tool on their site.


    Acknowledgement: This research was supported by the Academy of Finland (grant number 121980).

    References

    [1] G. Adomavicius and A. Gupta, Toward Comprehensive Real-time Bidder Support in Iterative Combinatorial Auctions, Inf Syst Research 16, 2005, 169-185.

    [2] C. Beam, A. Segev and J.G. Shanthikumar, Electronic Negotiation through Internet-based Auctions, CITM Working Paper 96-WP-1019, 1996.

    [3] P. Bajari and A. Hortaçsu, The Winner's Curse, Reserve Prices, and Endogenous Entry: Empirical Insights from eBay Auctions, RAND Journal of Economics 34 (2), 2003, 329-355.

    [4] Borrower Decision Aid, 2008, http://www.tikkunekut.org/prosper/index.php/, viewed September 10th 2008.

    [5] Eric's Credit Community website, Prosper Loan Growth Last 24 Months, http://www.ericscc.com/stats/prosper-loan-growth/, viewed September 5th 2008.

    [6] Eric's Credit Community website, Lender Interest Rate History, http://www.ericscc.com/stats/interest-rate-history/, viewed September 5th 2008.

    [7] Federal Reserve, Open Market Operations, http://www.federalreserve.gov/fomc/fundsrate.htm, viewed September 5th 2008.

    [8] J. Gallien and L.M. Wein, A Smart Market for Industrial Procurement with Capacity Constraints, Management Science 51, 2005, 76-91.

    [9] J.H. Gilkeson and K. Reynolds, Determinants of Internet Auction Success and Closing Price: An Exploratory Study, Psychology & Marketing 20 (6), 2004, 537-566.

    [10] R.M. Harstad, Dominant Strategy Adoption and Bidders' Experience with Pricing Rules, Experimental Economics 3, 2000, 261-280.


    [11] J.H. Kagel, R.M. Harstad, and D. Levin, Information Impact and Allocation Rules in Auctions with Affiliated Private Values: A Laboratory Study, Econometrica 55 (6), 1987, 1275-1304.

    [12] R. Katkar and D. Reiley, Public versus Secret Reserves in Auctions: Results from a Pokémon Field Experiment, NBER Working Paper No. W8183, 2001, Available at SSRN: http://ssrn.com/abstract=264437.

    [13] R.J. Kauffman, T.J. Spaulding, and C.A. Wood, Are Online Auction Markets Efficient? An Empirical Study of Market Liquidity and Abnormal Returns, Decision Support Systems 48 (1), 2009, 3-13.

    [14] G. Ku, A.D. Galinsky and J.K. Murnighan, Starting Low but Ending High: A Reversal of the Anchoring Effect in Auctions, Journal of Personality and Social Psychology 90 (6), 2006, 975-986.

    [15] A.M. Kwasnica, J.O. Ledyard, D. Porter and C. DeMartini, A New and Improved Design for Multiobject Iterative Auctions, Management Science 51, 2005, 419-434.

    [16] R. Leskelä, J. Teich, H. Wallenius and J. Wallenius, Decision Support for Multi-Unit Combinatorial Bundle Auctions, Decision Support Systems 43 (2), 2007, 420-434.

    [17] A. Lin and J. Teich, P2P Lending - A Credit Revolution?, New Mexico Business Outlook, NMSU College of Business, July 2006.

    [18] D. Lucking-Reiley, D. Bryan, N. Prasad, and D. Reeves, Pennies from eBay: The Determinants of Price in Online Auctions, Journal of Industrial Economics 55 (2), 2007, 223-233.

    [19] R.P. McAfee and J. McMillan, Auctions and Bidding, Journal of Economic Literature 25 (2), 1987, 699-738.


    [20] P.R. Milgrom and R.J. Weber, A Theory of Auctions and Competitive Bidding, Econometrica 50 (5), 1982, 1089-1122.

    [21] R.S. Pindyck and D.L. Rubinfeld, Econometric Models and Economic Forecasts (McGraw-Hill, Boston, 1997).

    [22] Prosper.com, Listing Summary for Listing Number #385755, 2008, http://www.prosper.com/lend/listing.aspx?listingID=385755/, viewed September 5th 2008.

    [23] T. Sueyoshi and G.R. Tadiparthi, An Agent-Based Decision Support System for Wholesale Electricity Markets, Decision Support Systems 44 (2), 2008, 425-446.

    [24] J.E. Teich, H. Wallenius, J. Wallenius, and A. Zaitsev, A Multi-Attribute e-Auction Mechanism for Procurement: Theoretical Foundations, EJOR 175 (1), 2006, 90-100.

    [25] D. Van Heijst, R. Potharst and M. Van Wezel, A Support System for Predicting eBay End Prices, Decision Support Systems 44 (4), 2008, 970-982.

    [26] J. Zhang, The Roles of Players and Reputation: Evidence from eBay Online Auctions, Decision Support Systems 42 (3), 2006, 1800-1818.


    Appendix: Listing Factors and Additional Credit Information

    Listing Factor            Description
    Amount Funded             The sum of bid amounts or requested amount if fully funded
    Amount Remaining          The amount still remaining unfunded
    Amount Requested          The amount requested in the listing
    Bid Count                 The number of bids on this listing
    Borrower City             The home city of the borrower
    Borrower Starting Rate    The starting rate of the listing
    Borrower State            The home state of the borrower
    Category                  One of the following: Not available, Debt consolidation, Home improvement,
                              Business loan, Personal loan, Student loan, Auto loan, Other
    Creation Date             The date the listing was created
    Credit Grade              The credit grade of the borrower (AA-HR)
    Debt-to-Income Ratio      The debt-to-income ratio of the borrower
    Description               The description about the listing written by the borrower
    Duration                  The duration of the listing
    End Date                  The date when the listing ends
    Funding Option            One of the following: Open for duration, Close when funded
    Group Key                 The identifier code of the group in which the borrower is a member
    Is Borrower Homeowner     Specifies if the borrower is a verified homeowner
    Key                       The identifier code of the listing
    Lender Rate               The final interest rate of the listing
    Member Key                The identifier code of the borrower
    Start Date                The starting time of the listing
    Status                    One of the following: Active, Withdrawn, Expired, Completed, Cancelled,
                              Pending Verification
    Title                     The title of the listing


    Additional Credit Information   Description
    Amount Delinquent               The amount delinquent at the time the listing was created
    Bankcard Utilization            The percentage of available revolving credit that is utilized at the
                                    time the listing was created
    Borrower Occupation             The occupation of the borrower
    Current Credit Lines            The number of credit lines
    Current Delinquencies           The number of current delinquencies
    Date Pulled                     The date when the credit information was pulled
    Delinquencies Last 7 Years      The number of delinquencies in the last 7 years
    Employment Status               The employment status of the borrower
    First Recorded Line of Credit   The date of the first recorded credit line of the borrower
    Income                          The annual income range of the borrower:
                                    0 - Not displayed
                                    1 - $0 or unable to verify
                                    2 - $1-24,999
                                    3 - $25,000-49,999
                                    4 - $50,000-74,999
                                    5 - $75,000-99,999
                                    6 - $100,000+
                                    7 - Not employed
    Inquiries Last 6 Months         The number of inquiries in the last 6 months
    Length Status Months            The length of the employment status in months
    Open Credit Lines               The number of open credit lines
    Public Records Last 10 Years    The number of public records in the last 10 years
    Public Records Last 12 months   The number of public records in the last 12 months
    Revolving Credit Balance        Amount of revolving credit balance
    Total Credit Lines              The number of total credit lines


    Figure 1 Screen image from one listing on Prosper.com (Prosper.com, Used with

    Permission, 2008)


    Figure 2 Interest rates by credit grades on Prosper.com (Eric's credit community,

    2008b)


    [Chart: Query Method vs. Logistic Regression Method (Starting Rate). X-axis: Starting Rate, 0-30%. Left y-axis: Success Rate, 0-1. Right y-axis: No of Listings, 0-70. Series: No of Listings, Query, Logistic Regression.]

    Figure 3 Query method compared against logistic regression method

    [Chart: Query Method vs. Logistic Regression Method (Amount). X-axis: Amount, 0-25000. Left y-axis: Success Rate, 0-1. Right y-axis: No of Listings, 0-200. Series: No of Listings, Query, Logistic regression.]

    Figure 4 Query method compared against logistic regression method


    Figure 5 Example Screen Image of Borrower Decision Aid (Borrower Decision Aid,

    2008)


    Table 1 Pairwise correlation tests between success of the listing and listing

    characteristics

    Variable                      Correlation   P-Value   Observations
    Amount Requested              -0.06         0.0000    293976
    Starting Rate                  0.06         0.0000    293976
    Funding Option                 0.03         0.0000    293976
    Duration                       0.02         0.0000    293976
    Credit Grade*                 -0.28         0.0000    293976
    Debt-to-Income Ratio          -0.04         0.0000    274246
    Homeownership                  0.07         0.0000    293976
    Current Delinquencies         -0.14         0.0000    291732
    Delinquencies last 7 years    -0.10         0.0000    291732
    Amount Delinquent             -0.06         0.0000    220252
    Income                         0.03         0.0000    293976

    * Credit grade AA-HR transformed into interval scale of 1-7

    Table 2 Pairwise correlation between success of the listing and listing characteristics in

    credit grades A and D

                                  Credit Grade A                    Credit Grade D
    Variable                      Correlation   P-Value   Obs.      Correlation   P-Value   Obs.
    Amount Requested              -0.24         0.0000    11073     -0.18         0.0000    43234
    Starting Rate                  0.11         0.0000    11073      0.17         0.0000    43234
    Funding Option                -0.02         0.0441    11073     -0.04         0.0000    43234
    Duration                       0.03         0.0006    11073      0.02         0.0003    43234
    Debt-to-Income Ratio          -0.09         0.0000    9501      -0.05         0.0000    40033
    Homeownership                  0.00         0.9947    11073     -0.03         0.0000    43234
    Current Delinquencies         -0.04         0.0000    11035     -0.07         0.0000    42985
    Delinquencies last 7 years    -0.01         0.2484    11035     -0.04         0.0000    42985
    Amount Delinquent             -0.01         0.2286    9568      -0.04         0.0000    37174
    Income                         0.02         0.0091    11073     -0.05         0.0000    43234

    Table 3 Logistic regression coefficients and Pseudo-R2s

                    Coefficients
    Credit Grade    Constant   Starting rate   Amount      DTI        Delinquencies   Pseudo-R²   Observations
    AA              -0.291     14.990          -0.000140   -0.83481   -0.788          0.170       2844
    A               -0.773     16.895          -0.000156   -2.68859   -0.617          0.208       3562
    B               -0.911     10.544          -0.000167   -0.93690   -0.528          0.166       5682
    C               -0.356      7.177          -0.000268   -2.20924   -0.403          0.191       9610
    D               -1.184      5.514          -0.000273   -1.34284   -0.399          0.162       12482
    E               -3.271     11.204          -0.000652   -1.53234   -0.195          0.218       11436
    HR              -3.424      9.357          -0.000846   -0.62510   -0.139          0.206       21908


    Table 4 Improvement in Pseudo-R2 when the number of variables is increased

    Explanatory Variable     Pseudo-R²
    Amount                   0.076
    Starting Rate            0.122
    DTI                      0.170
    Current Delinquencies    0.208
    All possible             0.214

    Data from credit grade A

    Table 5 Pairwise correlation table for regression variables in credit grade A

                            FinalRate   Amount    StartingRate   DTI
    FinalRate               1
    Amount                  0.5618      1
    StartingRate            0.8218      0.5496    1
    DTI                     0.2155      0.0999    0.1927         1
    CurrentDelinquencies    0.1601      -0.1152   0.1695         -0.0647

    Table 6 Regression model for the final rate

    Credit Grade   Starting rate   Amount     DTI     Current Delinq.   Constant   R²      Obs    Final Rate Estimate
    AA             0.316           0.00017    0.005   0.750             3.659      0.639   1022   10.38
      p-value      0.000           0.000      0.000   0.000             0.000
    A              0.527           0.00013    0.009   0.391             2.717      0.693   931    13.18
      p-value      0.000           0.000      0.000   0.000             0.000
    B              0.494           0.00014    0.004   0.330             4.624      0.565   1280   14.37
      p-value      0.000           0.000      0.040   0.000             0.000
    C              0.739           0.00006    0.008   0.390             0.443      0.665   1535   14.34
      p-value      0.000           0.081      0.019   0.000             0.190
    D              0.683           -0.0002    0.010   0.380             3.713      0.546   1284   15.33
      p-value      0.000           0.000      0.013   0.000             0.000
    E              0.840           0.00045    0.023   0.284             -1.005     0.509   536    17.31
      p-value      0.000           0.002      0.048   0.000             0.436
    HR             0.870           0.00045    0.004   0.120             -0.260     0.676   484    17.82
      p-value      0.000           0.009      0.455   0.002             0.796