ACHIEVING A HYPERLOCAL HOUSING PRICE INDEX:
OVERCOMING DATA SPARSITY BY BAYESIAN
DYNAMICAL MODELING OF MULTIPLE DATA
STREAMS
By You Ren, Emily B. Fox, and Andrew Bruce
University of Washington

Understanding how housing values evolve over time is important to
policy makers, consumers and real estate professionals. Existing
housing indices are computed at a coarse spatial granularity, such as
metropolitan regions, which can mask or distort price dynamics
apparent in local markets, such as neighborhoods and census tracts. A
challenge in moving to estimates at, for example, the census tract
level is the sparsity of spatiotemporally localized house sales
observations. Our work addresses this challenge by leveraging
observations from multiple census tracts discovered to have correlated
valuation dynamics. Our proposed Bayesian nonparametric approach
builds on the framework of latent factor models to enable a flexible,
data-driven method for inferring the clustering of correlated census
tracts. We explore methods for scalability and parallelizability of
computations, yielding a housing valuation index at the level of
census tract rather than zip code, and on a monthly basis rather than
quarterly. Our analysis is performed on a large Seattle metropolitan
housing dataset.
1. Introduction.
The housing market is a large part of the global economy. In the
United States, roughly fifty percent of household wealth is in
residential real estate, according to a Federal Reserve study
(Iacoviello, 2011). Between 15% and 17% of the U.S. gross domestic
product is spent on housing and housing-related services, according to
GDP statistics published by the U.S. Bureau of Economic Analysis.
Understanding how the value of housing changes over time is important
to policy makers, consumers, real estate professionals and mortgage
lenders. Valuation is relatively straightforward for commoditized
sectors of the economy, such as energy or non-discretionary spending.
By contrast, valuation of residential real estate is intrinsically
difficult due to the individual nature of houses. Since the
composition of the houses sold changes from one time period to the
next, the change in the reported prices does not necessarily reflect
the overall change in value. Consequently, economists and public
policy researchers have devoted considerable effort to developing a
meaningful index to measure the change in housing prices over time.
arXiv:1505.01164v1 [stat.AP] 5 May 2015
The most common approach to constructing a housing price index is the
repeat sales model, first proposed by Bailey et al. (1963). The main
idea is to use a pair of sales for the same house to model the price
trend over time. Assuming the house remains in the same condition, the
first sales price serves as a surrogate for the house hedonics
(house-level covariates), and the difference between the first and
subsequent sales prices captures the change in value over that
intra-sales period. This approach largely circumvents the problem
caused by the change in composition of houses sold. A large body of
literature extends the original repeat sales model with numerous
modifications and improvements (cf., Case and Shiller, 1987, 1989;
Gatzlaff and Haurin, 1997; Shiller, 1991; Goetzmann and Peng, 2002).
The repeat sales model is the basis for the Case-Shiller home value
index, published by Core-Logic and widely disseminated by the media.
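
The repeat sales idea described above can be made concrete with a
small sketch. It assumes the classical formulation in which the log
price ratio of each sale pair is regressed on period indicators (-1 in
the period of the first sale, +1 in the period of the second), with
the first period normalized to zero. The function name, the tuple
layout and the toy numbers are illustrative assumptions, not the
authors' implementation, and this is not the Bayesian model proposed
in this paper.

```python
import numpy as np

def repeat_sales_index(pairs, n_periods):
    """Least-squares log price index from repeat sale pairs.

    pairs: iterable of (t_first, t_second, price_first, price_second),
           with integer periods 0 <= t_first < t_second < n_periods.
    Returns an array of length n_periods; entry 0 is fixed at 0.
    (Hypothetical helper, for illustration only.)
    """
    pairs = list(pairs)
    # One column per period except period 0, which is the baseline.
    X = np.zeros((len(pairs), n_periods - 1))
    y = np.empty(len(pairs))
    for i, (t1, t2, p1, p2) in enumerate(pairs):
        if t1 > 0:
            X[i, t1 - 1] = -1.0   # period of the first sale
        X[i, t2 - 1] = 1.0        # period of the second sale
        y[i] = np.log(p2 / p1)    # change in log price between sales
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.concatenate(([0.0], beta))

# Toy data (made up): three houses, each sold twice over three periods.
toy_pairs = [
    (0, 1, 100.0, 100.0 * np.exp(0.1)),
    (1, 2, 200.0, 200.0 * np.exp(0.2)),
    (0, 2, 150.0, 150.0 * np.exp(0.3)),
]
log_index = repeat_sales_index(toy_pairs, n_periods=3)
```

Because each house is compared only with itself, differences in house
characteristics cancel out of the regression, which is why the change
in composition of houses sold does not distort the estimated trend.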
One drawback of a repeat sales model is that houses with only a single
sales transaction are discarded from the dataset. Case and Shiller
(1987) report that, over a study period of 16 years, single sales make
up as much as 93%-97% of total transactions for metropolitan areas
such as Atlanta, Dallas, Chicago and San Francisco. As such, studies
based on repeat sales data rely on only a fraction of all transactions
and may not be a good representation of the entire housing market.
Englund and Redfearn (1999) and Meese and Wallace (1997) detected a
sample selection bias in which the repeat sales properties are older,
smaller and more modest than single-sale properties. Furthermore,
small samples lead to less precise parameter estimation. To overcome
this, Case and Quigley (1991) propose a hybrid model that combines
repeat sales with hedonic information to make use of all sales.
Recently, Nagaraja et al. (2011) propose an autoregressive repeat
sales model that utilizes all sales data without the need for hedonic
information. Their approach leads to an index estimated quarterly at
the zip code level.
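
To see how much data a repeat sales filter discards, one can compute
the share of transactions that belong to houses sold only once. The
sketch below is a minimal illustration; the function name and the toy
list of house identifiers are assumptions made for the example and are
not drawn from the datasets cited above.

```python
from collections import Counter

def single_sale_share(house_ids):
    """Fraction of transactions that come from houses sold exactly once.

    house_ids: one house identifier per recorded transaction.
    (Hypothetical helper, for illustration only.)
    """
    house_ids = list(house_ids)
    if not house_ids:
        raise ValueError("no transactions supplied")
    sales_per_house = Counter(house_ids)
    single = sum(1 for n in sales_per_house.values() if n == 1)
    return single / len(house_ids)

# Toy data (made up): house "a" sold twice, the others once each.
share = single_sale_share(["a", "a", "b", "c", "d"])
```

A classical repeat sales model would drop the transactions counted in
this share, keeping only houses with at least two recorded sales.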
Existing repeat sales models, even those using all of the transactions,
perform the best when fit to relatively l
