eScholarship Repository, California Digital Library

Department of Statistics, UCLA
University of California, Los Angeles

Data Mining Within a Regression Framework
Richard Berk, University of California, Los Angeles

January 1, 2004

ABSTRACT:
Regression analysis can imply a broader range of techniques than is ordinarily appreciated. Statisticians commonly define regression so that the goal is to understand “as far as possible with the available data how the conditional distribution of some response y varies across subpopulations determined by the possible values of the predictor or predictors” (Cook and Weisberg, 1999: 27). For example, if there is a single categorical predictor such as male or female, a legitimate regression analysis has been undertaken if one compares two income histograms, one for men and one for women. Alternatively, one might compare summary statistics from the two income distributions: the mean incomes, the median incomes, the two standard deviations of income, and so on. One might also compare the shapes of the two distributions with a Q-Q plot.
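
The comparison described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: the incomes are synthetic values drawn from an arbitrary log-normal distribution, and the helper names `summarize` and `qq_points` are hypothetical. It computes the summary statistics for each group and pairs matching quantiles, which are the points a Q-Q plot would display.

```python
import random
import statistics


def summarize(incomes):
    """Return summary statistics of one conditional income distribution."""
    return {
        "mean": statistics.mean(incomes),
        "median": statistics.median(incomes),
        "sd": statistics.stdev(incomes),
    }


def qq_points(x, y, n=10):
    """Pair matching quantiles of two samples (the points of a Q-Q plot)."""
    qx = statistics.quantiles(x, n=n)
    qy = statistics.quantiles(y, n=n)
    return list(zip(qx, qy))


# Synthetic incomes, invented purely for illustration (not from the paper).
rng = random.Random(0)
incomes = {
    "male": [rng.lognormvariate(10.8, 0.5) for _ in range(500)],
    "female": [rng.lognormvariate(10.6, 0.5) for _ in range(500)],
}

# The "regression": how the distribution of y varies across predictor values.
summaries = {group: summarize(values) for group, values in incomes.items()}
for group, stats in summaries.items():
    print(group, {name: round(value) for name, value in stats.items()})

# Points far from the line y = x indicate that the two distributions differ.
for q_male, q_female in qq_points(incomes["male"], incomes["female"]):
    print(round(q_male), round(q_female))
```

With a single categorical predictor, no model needs to be fitted: the conditional distributions are simply the two subsamples, and every comparison above is a statement about how the distribution of income varies with the predictor.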

SUGGESTED CITATION:
Richard Berk, "Data Mining Within a Regression Framework" (January 1, 2004). Department of Statistics, UCLA. Department of Statistics Papers. Paper 2004010101.
http://repositories.cdlib.org/uclastat/papers/2004010101

 
eScholarship is a service of the California Digital Library.