Research Article

Predicting Days on Market to Optimize Real Estate Sales Strategy

Table 4

Recoding of original variables.

| Variable | Original version | Recoded version |
| --- | --- | --- |
| date_first/last_seen | Both variables are in the format “yyyy-mm-dd” | Six new variables were created: year_start/end, month_start/end, and day_start/end |
| rent_or_sell | “rent” and “sell” | Recoded to binary: 0 for rent, 1 for sale |
| property_type | 1, 2, 3, 4, maisonette, multiple rooms, room, studio | Recoded completely to numbers: 1, 2, 3, 4, 5, 6, 7, 8 |
| neighbourhood | Contains many character values with the name of the neighbourhood | An id number was assigned to each neighbourhood |
| lister_type | Contains the levels “owner,” “investor,” “builder,” “blank,” and “agency (looks like).” The value “agency (looks like)” is a mistake made during data collection; in reality it represents either investors or builders | Since no strict condition for distinguishing between investor and builder was found, the value “agency (looks like)” was randomly replaced with either builder or investor. New variables with codes from 1 to 4 were created |
| build_type | Originally the variable contains both the year and the building material | The variable was split into two new variables: year_built and type_built |
| specials | Text variable in the format [“word1”, “word2”, “word3”, …] | Binary variables for each word, indicating the presence or absence of the feature |
| floor | Originally in a format such as “5 to 10” | Split into two new variables: floor_new and total_floors |
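
Several of the recoding steps in Table 4 can be illustrated with a minimal Python sketch. It is not the authors' code. The helper names, the input column names date_first_seen and date_last_seen, the example listing, and the feature vocabulary are all hypothetical and chosen only for illustration. The sketch also assumes that specials has already been parsed into a list of words and that the random replacement of “agency (looks like)” gives both alternatives equal probability, which the source does not state.

```python
import random

# Binary coding taken from Table 4: 0 for rent, 1 for sale.
RENT_OR_SELL_CODES = {"rent": 0, "sell": 1}


def split_date(value):
    """Split a 'yyyy-mm-dd' string into integer (year, month, day)."""
    year, month, day = (int(part) for part in value.split("-"))
    return year, month, day


def split_floor(value):
    """Split a string such as '5 to 10' into (floor_new, total_floors)."""
    floor_new, total_floors = (int(part) for part in value.split(" to "))
    return floor_new, total_floors


def recode_lister_type(value, rng):
    """Replace the erroneous level at random with builder or investor.

    Equal probability for both alternatives is an assumption.
    """
    if value == "agency (looks like)":
        return rng.choice(["builder", "investor"])
    return value


def specials_to_indicators(specials, vocabulary):
    """Return one 0/1 indicator per vocabulary word."""
    present = set(specials)
    return {word: int(word in present) for word in vocabulary}


def recode_row(row, rng, vocabulary):
    """Apply the recodings above to one listing given as a dict."""
    out = {}
    out["year_start"], out["month_start"], out["day_start"] = split_date(
        row["date_first_seen"]
    )
    out["year_end"], out["month_end"], out["day_end"] = split_date(
        row["date_last_seen"]
    )
    out["rent_or_sell"] = RENT_OR_SELL_CODES[row["rent_or_sell"]]
    out["lister_type"] = recode_lister_type(row["lister_type"], rng)
    out["floor_new"], out["total_floors"] = split_floor(row["floor"])
    out.update(specials_to_indicators(row["specials"], vocabulary))
    return out


# Hypothetical listing; none of these values come from the study's data.
example = {
    "date_first_seen": "2019-03-15",
    "date_last_seen": "2019-04-02",
    "rent_or_sell": "sell",
    "lister_type": "agency (looks like)",
    "floor": "5 to 10",
    "specials": ["garage", "elevator"],
}
recoded = recode_row(
    example, random.Random(0), ["elevator", "garage", "furnished"]
)
print(recoded)
```

Passing a seeded random.Random instance makes the random replacement reproducible across runs, which matters when the recoded data set is later used for model comparison.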