localpoly package
Subpackages
Submodules
localpoly.base module
-
class localpoly.base.LocalPolynomialRegression(X, y, h, kernel='gaussian', gridsize=100)
Bases: object
Local polynomial regression.
LocalPolynomialRegression fits a polynomial of degree 3 to the neighborhood of each point. The neighborhood is defined by a kernel with bandwidth h. The regression returns the fit as well as its first and second derivatives.
- Parameters
X – X-values of data that is to be fitted (explanatory variable)
y – y-values of data that is to be fitted (observations)
h – bandwidth for the kernel
gridsize – desired size of the fit (granularity)
kernel – name of the kernel as a string; defaults to “gaussian”
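To illustrate how a kernel with bandwidth h defines the neighborhood of a point, the following minimal sketch computes Gaussian weights for two bandwidths. The helper name gaussian_kernel_weights is hypothetical and not part of localpoly, and the package's own kernel may be normalized differently.

```python
import numpy as np

def gaussian_kernel_weights(X, x0, h):
    """Gaussian kernel weights of the observations X relative to the point x0.

    Hypothetical helper for illustration; not taken from localpoly.
    """
    u = (np.asarray(X, dtype=float) - x0) / h      # distances to x0 in units of h
    return np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * h)

X = np.linspace(0.0, 1.0, 11)                      # X[5] is 0.5, X[0] is 0.0
narrow = gaussian_kernel_weights(X, x0=0.5, h=0.05)
wide = gaussian_kernel_weights(X, x0=0.5, h=0.5)

# Weight of the farthest observation relative to the closest one
print(narrow[0] / narrow[5])   # practically zero: only close neighbors matter
print(wide[0] / wide[5])       # about 0.61: the whole sample contributes
```

A small h therefore yields a very local (and noisier) fit, while a large h averages over most of the data.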
-
fit(prediction_interval)
Fit the local polynomial regression model over the prediction interval.
- Parameters
prediction_interval (tuple) – interval for which the prediction is calculated
- Returns
Results of the fit: the estimated function (fit) on the prediction interval (X) and its first and second derivatives:
    {
        'X': X_domain,     # prediction interval of fit
        'fit': fit,        # fit of the function at point x
        'first': first,    # first derivative at point x
        'second': second,  # second derivative at point x
    }
- Return type
dict
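The grid evaluation described above can be sketched without the package. This is an illustration under stated assumptions, not the package's implementation: fit_on_grid is a hypothetical name, the grid is assumed to be equally spaced with gridsize points, and np.polyfit is used as a stand-in for the weighted least-squares step.

```python
import numpy as np

def fit_on_grid(X, y, h, prediction_interval, gridsize=100, degree=3):
    """Evaluate a local polynomial fit on an equally spaced grid (sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X_domain = np.linspace(prediction_interval[0], prediction_interval[1], gridsize)
    fit = np.empty(gridsize)
    first = np.empty(gridsize)
    second = np.empty(gridsize)
    for i, x0 in enumerate(X_domain):
        k = np.exp(-0.5 * ((X - x0) / h) ** 2)      # Gaussian kernel weights
        # np.polyfit applies w to the unsquared residuals, so pass sqrt(k)
        c = np.polyfit(X - x0, y, deg=degree, w=np.sqrt(k))
        # c runs from the highest to the lowest power of (X - x0);
        # the second derivative is twice the quadratic coefficient
        fit[i], first[i], second[i] = c[-1], c[-2], 2.0 * c[-3]
    return {"X": X_domain, "fit": fit, "first": first, "second": second}

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 2.0 * np.pi, 400))
y = np.sin(X) + rng.normal(scale=0.05, size=X.size)

res = fit_on_grid(X, y, h=0.5, prediction_interval=(1.0, 5.0), gridsize=50)
print(np.max(np.abs(res["fit"] - np.sin(res["X"]))))   # small compared with 1
```

The returned dictionary mirrors the keys documented above, so the sketch can be compared with the package's output on the same data.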
-
localpoly(x)
Calculates the estimate at position x via local polynomial regression.
Local polynomial regression yields not only the estimate but also its first and second derivatives at this point. The data (X, y) and the regression settings (kernel, h) are stored in self.
- Parameters
x (float) – Position for which to calculate the estimated value.
- Returns
Results of the regression: the estimated value at point x, its first and second derivatives at this point, and the weight vector describing the influence of the surrounding points:
{"fit": beta[0], "first": beta[1], "second": beta[2], "weight": W_hi}
- Return type
dict
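The estimate at a single position can be sketched as a kernel-weighted least-squares problem. The code below is a minimal illustration, not the package's source: local_poly_at is a hypothetical name, the design columns are scaled by 1/j! so that beta[j] directly estimates the j-th derivative (an assumption about how beta[2] is meant above), and the returned weights are the raw kernel weights, which may differ from the package's W_hi.

```python
import math
import numpy as np

def local_poly_at(X, y, x0, h, degree=3):
    """Weighted least-squares polynomial fit centered at x0 (illustrative only)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d = X - x0
    w = np.exp(-0.5 * (d / h) ** 2)                 # Gaussian kernel weights
    # Column j is d**j / j!, so beta[j] estimates the j-th derivative at x0
    D = np.column_stack([d**j / math.factorial(j) for j in range(degree + 1)])
    sw = np.sqrt(w)
    # Minimizes sum_i w_i * (y_i - D_i @ beta)**2 through ordinary least squares
    beta, *_ = np.linalg.lstsq(D * sw[:, None], y * sw, rcond=None)
    return {"fit": beta[0], "first": beta[1], "second": beta[2], "weight": w}

X = np.linspace(-1.0, 2.0, 61)
y = 1.0 + 2.0 * X - 0.5 * X**2 + X**3              # noise-free cubic
res = local_poly_at(X, y, x0=0.5, h=0.3)
print(res["fit"], res["first"], res["second"])     # approximately 2.0 2.25 2.0
```

Because the test function is itself a cubic, the degree-3 local fit reproduces its value and derivatives at x0 = 0.5 up to rounding error.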
-
class localpoly.base.LocalPolynomialRegressionCV(X, y, kernel='gaussian', gridsize=100, n_sections=10, loss='MSE', sampling='random')
Bases: localpoly.base.LocalPolynomialRegression
Bandwidth selection via cross-validation for local polynomial regression.
LocalPolynomialRegressionCV performs the parameter optimization for LocalPolynomialRegression. The optimal bandwidth depends strongly on the data (X, y) and the kernel.
- Parameters
X (np.array) – X-values of data that is to be fitted (explanatory variable)
y (np.array) – y-values of data that is to be fitted (observations)
kernel (str, optional) – Name of the kernel. Defaults to “gaussian”.
gridsize (int, optional) – Desired size of the fit (granularity). Defaults to 100.
n_sections (int, optional) – Number of sections into which the dataset is divided for cross-validation (k folds). Defaults to 10.
loss (str, optional) – Loss function for optimization. Defaults to “MSE”.
sampling (str, optional) – How the dataset is partitioned, either “random” or “slicing”. Defaults to “random”.
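The interplay of n_sections and sampling can be illustrated with a small partitioning helper. This is a sketch under assumptions: make_sections and its seed argument are hypothetical, and "slicing" is assumed here to mean contiguous blocks in the original order, which the documentation above does not spell out.

```python
import numpy as np

def make_sections(n_samples, n_sections=10, sampling="random", seed=0):
    """Split the indices 0..n_samples-1 into n_sections disjoint test sets."""
    idx = np.arange(n_samples)
    if sampling == "random":
        # Shuffle before splitting so every section mixes the whole X range
        idx = np.random.default_rng(seed).permutation(n_samples)
    elif sampling != "slicing":
        raise ValueError(f"unknown sampling scheme: {sampling!r}")
    # np.array_split tolerates n_samples not being a multiple of n_sections
    return np.array_split(idx, n_sections)

sections = make_sections(25, n_sections=10, sampling="slicing")
print([len(s) for s in sections])   # [3, 3, 3, 3, 3, 2, 2, 2, 2, 2]
print(sections[0])                  # [0 1 2]
```

With sorted X-values, contiguous slices force the model to extrapolate into each held-out block, whereas random sections test interpolation between retained neighbors.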
-
prediction_interval
Interval in which to calculate the estimates, automatically set to (X.min(), X.max()).
-
bandwidth_cv(coarse_list_of_bandwidths)
Cross-validation for bandwidth optimization.
The CV routine is performed twice: first on coarse_list_of_bandwidths, then on a finer grid, fine_list_of_bandwidths, which spans the neighborhood of the first optimal value.
- Parameters
coarse_list_of_bandwidths (list) – coarse list of bandwidths; values around the Silverman bandwidth are suggested
- Returns
Fine results and coarse results of the bandwidth search:
    {
        "fine results": {
            "bandwidths": fine_list_of_bandwidths,
            "MSE": # mean squared errors for bandwidths,
            "h": # optimal bandwidth within fine_list_of_bandwidths,
        },
        "coarse results": {
            # ... same as above but with coarse_list_of_bandwidths
        },
    }
- Return type
dict
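The two-stage search can be sketched end to end as follows. Everything here is an assumption made for illustration rather than the package's code: the helper names (local_fit, cv_mse, search, silverman_bandwidth), the use of 5 random folds, the choice of a fine grid spanning 50% to 150% of the coarse optimum, and Silverman's rule of thumb in its common form 0.9 * min(std, IQR / 1.34) * n^(-1/5).

```python
import numpy as np

def local_fit(X_train, y_train, x_eval, h, degree=3):
    """Local polynomial estimates at x_eval from the training data (sketch)."""
    out = np.empty(len(x_eval))
    for i, x0 in enumerate(x_eval):
        k = np.exp(-0.5 * ((X_train - x0) / h) ** 2)   # Gaussian kernel weights
        # np.polyfit weights act on unsquared residuals, hence the sqrt;
        # the last coefficient is the fitted value at x0
        out[i] = np.polyfit(X_train - x0, y_train, deg=degree, w=np.sqrt(k))[-1]
    return out

def cv_mse(X, y, h, n_sections=5, seed=0):
    """Mean squared prediction error of bandwidth h over random folds."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(X)), n_sections)
    errors = []
    for test_idx in folds:
        train = np.ones(len(X), dtype=bool)
        train[test_idx] = False                        # hold out the current fold
        pred = local_fit(X[train], y[train], X[test_idx], h)
        errors.append(np.mean((y[test_idx] - pred) ** 2))
    return float(np.mean(errors))

def search(X, y, bandwidths):
    """Evaluate every candidate bandwidth and report the one with the lowest MSE."""
    bandwidths = np.asarray(bandwidths, dtype=float)
    mse = np.array([cv_mse(X, y, h) for h in bandwidths])
    return {"bandwidths": bandwidths, "MSE": mse, "h": float(bandwidths[np.argmin(mse)])}

def silverman_bandwidth(X):
    """Silverman's rule of thumb, used only to place the coarse grid."""
    q75, q25 = np.percentile(X, [75, 25])
    return 0.9 * min(np.std(X, ddof=1), (q75 - q25) / 1.34) * len(X) ** (-0.2)

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0.0, 2.0 * np.pi, 200))
y = np.sin(X) + rng.normal(scale=0.1, size=X.size)

h0 = silverman_bandwidth(X)
coarse = search(X, y, h0 * np.array([0.5, 1.0, 2.0, 4.0]))
# Fine grid: 7 values spanning 50% to 150% of the coarse optimum (assumed layout)
fine = search(X, y, np.linspace(0.5 * coarse["h"], 1.5 * coarse["h"], 7))
results = {"fine results": fine, "coarse results": coarse}
print(coarse["h"], fine["h"])
```

Reusing the same folds for every bandwidth, as the fixed seed does here, keeps the MSE values comparable across the coarse and fine grids.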